Preface

Pune 

26/May/2003 

These notes are derived from a school on low frequency radio astronomy that was held 
at NCRA, Pune from June 21 to July 17 1999. The school was funded by SERC, DST. 
Speakers at the school had been asked to provide a set of lecture notes prior to their 
lectures, and, somewhat to our own surprise, many of them actually did. Our plan had 
been to compile these notes and, at the end of the school, to issue them in book form. 
For various reasons, this didn't happen. The main problem was that while about half of 
the notes were nicely LaTeXed up with embedded figures, the other half varied 
enormously in quality. There was everything from half written plain text notes with
cut-paste graphics to stapled bundles of xeroxes of the slides used during the lecture. Our
editorial burden was hence considerable, and so we are especially glad to find that it is 
finally over, and that we need no longer feign temporary deafness when the topic of 
SERC school notes comes up. 

But to place the blame where it should rightly be placed, it must be admitted that this 
all started with our insistence on each speaker preparing a set of notes. There are 
several excellent books on radio astronomy and interferometry, and the US based 
NRAO regularly puts out a definitive set of "Synthesis Imaging" notes. Why then bother 
with producing something else? Well we had two major reasons. The first was that 
excellent though these books and notes may be, many Indian students do not have 
access to them. On the other hand there was a very clear need for us to have available 
some pedagogical material that we could freely distribute. The other was that we felt it 
would be nice to have lecture notes that were specifically focused on what we at the 
GMRT felt was important to us. 

This second fixation of ours has influenced these notes in two ways. The first, more
subtle effect is that we have tried (where possible) to stress issues that are of
concern in low frequency radio astronomy, but which may be less important, or even
irrelevant, at higher frequencies. The other is that there is an entire section of these
notes that is devoted exclusively to describing the GMRT. This section has been written 
for the more general reader, i.e. one who does not want to wade through arcane 
technical notes and reports (assuming that s/he is fortunate enough to need 
information on a topic for which some documentation exists!) to get an overview of the 
GMRT. 

We hope that these notes go some way towards meeting these two aims, and that
students of radio astronomy as well as GMRT users and new technical staff will find 
them useful. 

All that remains now is to thank all those who have contributed to this enterprise: the
speakers from the school for writing the notes to start with, the legions of NCRA friends
and colleagues who cheerfully proofread various versions, B. Premkumar who made
some of the figures and arranged for them to be printed, Annabhat Joshi who designed 
the cover and helped with getting the final master copy ready for the printer, SERC for 
funding the school, NCRA for providing financial and other support, and finally, all the 
project students from the last few years who bugged us for copies of the notes. 

We have done our best to eliminate typographical and other errors, but nonetheless
we are sure that several remain, for which we do, of course, admit complete
culpability. We would be grateful if readers who notice such errors could bring them to
our notice.

Jayaram N. Chengalur
Yashwant Gupta 
K. S. Dwarakanath 



Chapter 1 

Signals in Radio Astronomy 


Rajaram Nityananda 


1.1 Introduction 

The record of the electric field $E(t)$ received at a point on earth from a source of radio
waves can be called a “signal”, so long as we do not take this to imply intelligence at
the transmitting end. Emanating as it does from a large object with many independently
radiating parts, at different distances from our point, and containing many frequencies,
this signal is naturally random in character. In fact, this randomness is of an extreme
form. All measured statistical properties are consistent with a model in which different
frequencies have completely unrelated phases, and each of these phases can vary randomly
from 0 to $2\pi$. A sketch of such a signal is given in Fig. 1.1. The strength (squared
amplitude or power) of the different frequencies $\omega$ has a systematic variation which we
call the “power spectrum” $S(\omega)$. This chapter covers the basic properties of such signals,
which go by the name of “time-stationary gaussian noise”. Both the signal from
the source of interest and the noise added to this cosmic signal by the radio telescope
receivers can be described as time-stationary gaussian noise. The word “noise” of
course refers to the random character. “Noise” also evokes unwanted disturbance, but
this of course does not apply to the signal from the source (but does apply to what our
receivers unavoidably add to it). The whole goal of radio astronomy is to receive, process,
and interpret these cosmic signals (which were, ironically enough, first discovered as
a “noise” which affected trans-Atlantic radio communication).

“Time-stationary” means that the signal in one time interval is statistically
indistinguishable from that in another equal duration but time shifted interval. Like all
probabilistic statements, this can never be precisely checked but its validity can be made
more probable (circularity intended!) by repeated experiments. For example, we could look
at the probability distribution of the signal amplitude. An experimenter could take a
stretch of the signal, say from times 0 to $T$, select $N$ equally spaced values $E(t_i)$, $i$
going from 1 to $N$, and make a histogram of them. The property of time stationarity says
that this histogram will turn out to be (statistically) the same, with calculable errors
decreasing as $N$ increases, if one had chosen instead the stretch from $t$ to $t + T$, for
any $t$. The second important characteristic property of our random phase superposition of
many frequencies is that this histogram will tend to a gaussian with zero mean as $N$ tends
to infinity.

Figure 1.1: A signal made by superposition of many frequencies with random phases (horizontal axis: time)


1.2 Properties of the Gaussian 

The general statement of gaussianity is that we look at the joint distribution of $k$
amplitudes $x_1 = E(t_1)$, $x_2 = E(t_2)$, $\ldots$ etc. This is of the form

$$P(x_1 \ldots x_k) = \mathrm{const} \times \exp(-Q(x_1, x_2, \ldots, x_k))$$

$Q$ is a quadratic expression which clearly has to increase to $+\infty$ in any direction in the
$k$ dimensional space of the $x$'s. For just one amplitude,

$$P(x_1) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-x_1^2/2\sigma^2}$$

does the job and has one parameter, the “variance” $\sigma^2$, the mean being zero. This
variance is a measure of the power in the signal. For two variables, $x_1$ and $x_2$, the general
mathematical form is the “bivariate gaussian”

$$P(x_1, x_2) = \mathrm{const} \times \exp\left(-\tfrac{1}{2}\left(a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2\right)\right)$$

Such a distribution can be visualised as a cloud of points in $x_1$-$x_2$ space, whose
density is constant along ellipses $Q = $ constant (see Fig. 1.2).

The following basic properties are worth noting (and even checking!). 

1. We need $a_{11}$, $a_{22}$, and $a_{11}a_{22} - a_{12}^2$ all $> 0$ to have ellipses for the contours of constant
$P$ (hyperbolas or parabolas would be a disaster, since $P$ would not fall off at infinity).

2. The constant in front is

$$\frac{1}{2\pi} \times \sqrt{a_{11}a_{22} - a_{12}^2}$$

Figure 1.2: Contour lines of a bivariate gaussian distribution


3. The average values of $x_1^2$, $x_2^2$ and $x_1x_2$, when arranged as a matrix (the so called
covariance matrix) are the inverse of the matrix $A$ of $a$'s. For example,

$$\langle x_1^2 \rangle = a_{22}/\det A$$

$$\langle x_1x_2 \rangle = -a_{12}/\det A$$

etc.

4. By time stationarity,

$$\langle x_1^2 \rangle = \langle x_2^2 \rangle = \sigma^2$$

The extra information about the correlation between $x_1$ and $x_2$ is contained in
$\langle x_1x_2 \rangle$, i.e. in $a_{12}$, which (again by stationarity) can only be a function of the time
separation $\tau = t_1 - t_2$. We can hence write $\langle E(t)E(t+\tau) \rangle = C(\tau)$ independent of $t$.
$C(\tau)$ is called the autocorrelation function. From (1) above, $C^2(\tau) \le \sigma^4$. This suggests
that the quantity $r(\tau) = C(\tau)/\sigma^2$ is worth defining, as a dimensionless correlation
coefficient, normalised so that $r(0) = 1$. The generalisation of all these results for a
$k$ variable gaussian is given in Section 1.8.
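Properties 1 to 3 of the bivariate gaussian can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names and the example values of the $a$'s are illustrative assumptions, not part of the original notes. It inverts the matrix of $a$'s to get the covariance matrix, draws correlated pairs with that covariance, and compares the sampled averages with the predicted ones.

```python
import math
import random

def covariance_from_a(a11, a12, a22):
    """Invert the 2x2 matrix of a's: returns (<x1^2>, <x1 x2>, <x2^2>)."""
    det = a11 * a22 - a12 * a12
    if not (a11 > 0 and a22 > 0 and det > 0):
        raise ValueError("contours of constant P would not be ellipses")
    return a22 / det, -a12 / det, a11 / det

def sample_pairs(c11, c12, c22, n, rng):
    """Draw n pairs (x1, x2) with the requested covariance (Cholesky construction)."""
    l11 = math.sqrt(c11)
    l21 = c12 / l11
    l22 = math.sqrt(c22 - l21 * l21)
    pairs = []
    for _ in range(n):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((l11 * u, l21 * u + l22 * v))
    return pairs

def sample_moments(pairs):
    """Sample estimates of (<x1^2>, <x1 x2>, <x2^2>)."""
    n = len(pairs)
    m11 = sum(x1 * x1 for x1, _ in pairs) / n
    m12 = sum(x1 * x2 for x1, x2 in pairs) / n
    m22 = sum(x2 * x2 for _, x2 in pairs) / n
    return m11, m12, m22

# Illustrative (assumed) values of the a's.
a11, a12, a22 = 2.0, -1.0, 2.0
c11, c12, c22 = covariance_from_a(a11, a12, a22)
pairs = sample_pairs(c11, c12, c22, 100_000, random.Random(1))
print("predicted covariance:", (c11, c12, c22))
print("sampled covariance:  ", sample_moments(pairs))
print("normalising constant:", math.sqrt(a11 * a22 - a12 ** 2) / (2 * math.pi))
```

The sampled values agree with the predicted ones only to within statistical errors, which shrink as the number of pairs grows.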


1.3 The Wiener-Khinchin Theorem 

So far, we have only asserted that the sum of waves with random phases generates a
time-stationary gaussian signal. We now have to check this. It is convenient to start with
a signal going from 0 to $T$, and only later take the limit $T \to \infty$. The usual theory of
Fourier series tells us that we can write

$$E(t) = \sum_n \left(a_n \cos\omega_n t + b_n \sin\omega_n t\right) = \sum_n r_n \cos(\omega_n t + \varphi_n)$$

where,

$$\omega_n = \frac{2\pi n}{T}, \quad r_n = \sqrt{a_n^2 + b_n^2}, \quad \text{and} \quad \tan\varphi_n = -b_n/a_n$$

Notice that the frequencies come in multiples of the “fundamental” $2\pi/T$ which is very
small since $T$ is large, and hence they form a closely spaced set. We can now compute
the autocorrelation

$$C(\tau) = \langle E(t)E(t+\tau) \rangle = \left\langle \sum_n r_n \cos(\omega_n t + \varphi_n) \sum_m r_m \cos(\omega_m (t+\tau) + \varphi_m) \right\rangle$$

The averaging on the right hand side has to be carried out by letting each of the phases
$\varphi_k$ vary independently from 0 to $2\pi$. When we do this, only terms with $m = n$ can survive,
and we get

$$C(\tau) = \sum_n \tfrac{1}{2} r_n^2 \cos\omega_n\tau$$

Putting $\tau$ equal to zero, we get the variance

$$C(0) = \langle E^2(t) \rangle = \sum_n r_n^2/2$$

We note that the autocorrelation is independent of $t$ and hence we have checked time
stationarity, at least for this statistical property. We now have to face the limit $T \to \infty$.
The number of frequencies in a given range $\Delta\omega$ blows up as

$$\frac{\Delta\omega}{(2\pi/T)} = \frac{T\Delta\omega}{2\pi}$$

Clearly, the $r_n^2$ have to scale inversely with $T$ if statistical quantities like $C(\tau)$ are to have
a well defined $T \to \infty$ behaviour. Further, since the number of $r_n$'s even in a small interval
$\Delta\omega$ blows up, what is important is their combined effect rather than the behaviour of any
individual one. All this motivates the definition

$$\sum_{\omega < \omega_n < \omega + \Delta\omega} \frac{r_n^2}{2} = 2S(\omega)\Delta\omega$$

as $T \to \infty$. Physically, $2S(\omega)\Delta\omega$ is the contribution to the variance $\langle E^2(t) \rangle$ from the
interval $\omega$ to $\omega + \Delta\omega$. Hence the term “power spectrum” for $S(\omega)$. Our basic result for the
autocorrelation now reads

$$C(\tau) = \int_0^{\infty} 2S(\omega)\cos\omega\tau \, d\omega = \int_{-\infty}^{+\infty} S(\omega)e^{-i\omega\tau} \, d\omega$$

if we define $S(-\omega) = S(\omega)$.

This is the “Wiener-Khinchin theorem” stating that the autocorrelation function is
the Fourier transform of the power spectrum. It can also be written with the frequency
measured in cycles (rather than radians) per second and denoted by $\nu$.

$$C(\tau) = \int_0^{\infty} 2P(\nu)\cos(2\pi\nu\tau) \, d\nu = \int_{-\infty}^{+\infty} P(\nu)e^{-2\pi i\nu\tau} \, d\nu$$

and as before, $P(-\nu) = P(\nu)$.

In this particular case of the autocorrelation, we did not use independence of the $\varphi$'s.
Thus the theorem is valid even for a non-gaussian random process (for which different
$\varphi$'s are not independent). Notice also that we could have averaged over $t$ instead of
over all the $\varphi$'s and we would have obtained the same result, viz. that contributions
are nonzero only when we multiply a given frequency with itself. One could even argue
that the operation of integrating over the $\varphi$'s is summing over a fictitious collection (i.e.
“ensemble”) of signals, while integrating over $t$ and dividing by $T$ is closer to what we do
in practice. The idea that the ensemble average can be realised by the more practical
time average is called “ergodicity” and like everything else here, needs better proof than
we have given it. A rigorous treatment would in fact start by worrying about existence of
a well-defined $T \to \infty$ limit for all statistical quantities, not just the autocorrelation. This
is called “proving the existence of the random process”.
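The equivalence of the two kinds of average for the autocorrelation can be tried out on a small example. The sketch below (Python, standard library only) builds $E(t)$ from a handful of harmonics of $2\pi/T$ and compares the prediction $\sum_n \tfrac{1}{2} r_n^2 \cos\omega_n\tau$ with (a) a time average over one period for a single set of phases and (b) an average over many random sets of phases at a fixed $t$. The amplitudes, the number of harmonics and all function names are assumptions made purely for illustration.

```python
import math
import random

def make_signal(amplitudes, phases, T):
    """E(t) = sum_n r_n cos(w_n t + phi_n), with w_n = 2 pi n / T for n = 1, 2, ..."""
    omegas = [2 * math.pi * (n + 1) / T for n in range(len(amplitudes))]
    def E(t):
        return sum(r * math.cos(w * t + p)
                   for r, w, p in zip(amplitudes, omegas, phases))
    return E, omegas

def predicted_C(amplitudes, omegas, tau):
    """Autocorrelation predicted by the random-phase argument."""
    return sum(0.5 * r * r * math.cos(w * tau) for r, w in zip(amplitudes, omegas))

def time_average_C(E, T, tau, n_steps):
    """Average of E(t) E(t + tau) over one period, for one fixed set of phases."""
    dt = T / n_steps
    return sum(E(i * dt) * E(i * dt + tau) for i in range(n_steps)) / n_steps

def ensemble_average_C(amplitudes, T, t, tau, n_real, rng):
    """Average of E(t) E(t + tau) at fixed t over many random sets of phases."""
    total = 0.0
    for _ in range(n_real):
        phases = [rng.uniform(0, 2 * math.pi) for _ in amplitudes]
        E, _ = make_signal(amplitudes, phases, T)
        total += E(t) * E(t + tau)
    return total / n_real

rng = random.Random(7)
amplitudes = [1.0, 0.8, 0.6, 0.4]   # assumed r_n
T, tau = 10.0, 0.7
phases = [rng.uniform(0, 2 * math.pi) for _ in amplitudes]
E, omegas = make_signal(amplitudes, phases, T)
print("predicted:       ", predicted_C(amplitudes, omegas, tau))
print("time average:    ", time_average_C(E, T, tau, 64))
print("ensemble average:", ensemble_average_C(amplitudes, T, 1.3, tau, 20_000, rng))
```

With only a finite number of harmonics the time average over a full period reproduces the prediction essentially exactly, while the ensemble average converges to it statistically.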

The autocorrelation $C(\tau)$ and the power spectrum $S(\omega)$ could in principle be measured
in two different kinds of experiments. In the time domain, one could record samples of
the voltage and calculate averages of lagged products to get $C$. In the frequency domain
one would pass the signal through a filter admitting a narrow band of frequencies around
$\omega$, and measure the average power that gets through.

A simple but instructive application of the Wiener-Khinchin theorem is to a power spectrum
which is constant (“flat band”), say $P(\nu) = K$, between $\nu_0 - B/2$ and $\nu_0 + B/2$. A simple
calculation shows that

$$C(\tau) = 2KB \, \cos(2\pi\nu_0\tau) \, \frac{\sin(\pi B\tau)}{\pi B\tau}$$

The first factor $2KB$ is the value at $\tau = 0$, hence the total power/variance to radio
astronomers/statisticians. The second factor is an oscillation at the centre frequency.
This is easily understood. If the bandwidth $B$ is very small compared to $\nu_0$, the third factor
would be close to unity for values of $\tau$ extending over say $1/4B$, which is still many cycles
of the centre frequency. This approaches the limiting case of a single sinusoidal wave,
whose autocorrelation is sinusoidal. The third (sinc function) factor describes “bandwidth
decorrelation”¹, which occurs when $\tau$ becomes comparable to or larger than $1/B$.

Another important case, in some ways opposite to the preceding one, occurs when
$\nu_0 = B/2$, so that the band extends from 0 to $B$. This is a so-called “baseband”. In this
case, the autocorrelation is proportional to a sinc function of $2\pi B\tau$. Now, the correlation
between a pair of voltages measured at an interval of $1/2B$ or any multiple (except zero!)
thereof is zero, a special property of our flat band. In this case, we see very clearly that a
set of samples measured at this interval of $1/2B$, the so-called “Nyquist sampling interval”,
would actually be statistically independent since correlations between any pair vanish
(this would be clearer after going through Section 1.8). Clearly, this is the minimum
number of measurements which would have to be made to reproduce the signal, since if
we missed one of them the others would give us no clue about it. As we will now see, it is
also the maximum number for this bandwidth!
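The flat-band results are easy to verify numerically. The sketch below (Python, standard library only) evaluates the closed form $2KB\cos(2\pi\nu_0\tau)\sin(\pi B\tau)/(\pi B\tau)$, compares it with a direct midpoint-rule integration of $2P(\nu)\cos(2\pi\nu\tau)$ over the band, and checks that in the baseband case the correlation vanishes at multiples of $1/2B$. The numerical values of $K$, $\nu_0$ and $B$ and the function names are arbitrary choices for illustration.

```python
import math

def flat_band_C(tau, K, nu0, B):
    """Closed form 2KB cos(2 pi nu0 tau) sin(pi B tau) / (pi B tau)."""
    x = math.pi * B * tau
    sinc = 1.0 if x == 0 else math.sin(x) / x
    return 2 * K * B * math.cos(2 * math.pi * nu0 * tau) * sinc

def numerical_C(tau, K, nu0, B, n=20_000):
    """Midpoint-rule integral of 2 K cos(2 pi nu tau) from nu0 - B/2 to nu0 + B/2."""
    dnu = B / n
    lo = nu0 - B / 2
    total = sum(math.cos(2 * math.pi * (lo + (j + 0.5) * dnu) * tau) for j in range(n))
    return 2 * K * total * dnu

K, nu0, B = 1.0, 10.0, 2.0   # assumed, arbitrary units
for tau in (0.0, 0.1, 0.26, 0.6):
    print(tau, flat_band_C(tau, K, nu0, B), numerical_C(tau, K, nu0, B))

# Baseband: nu0 = B/2, so samples spaced by 1/(2B) are uncorrelated.
for n in range(1, 4):
    print(n, flat_band_C(n / (2 * B), K, B / 2, B))
```

The last loop prints values that are zero to rounding error, which is the statement that Nyquist-spaced samples of a flat baseband are uncorrelated.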


1.4 The Sampling Theorem 

This more general property of a band-limited signal (one with zero power outside a bandwidth
$B$) goes by the name of the “Shannon Sampling Theorem”. It states that a set of
samples separated by $1/2B$ is sufficient to reconstruct the signal. One can obtain a preliminary
feel for the theorem by counting Fourier coefficients. The number of parameters
defining our signal is twice the number of frequencies (since we have an $a$ and a $b$, or
an $r$ and a $\varphi$, for each $\omega_n$). Hence the number of real values needed to specify our signal
for a time $T$ is

$$2 \times \frac{\Delta\omega \, T}{2\pi} = 2 \left(\frac{\Delta\omega}{2\pi}\right) T = 2BT$$

The rate at which new real numbers need to be measured to keep pace with the signal
is $2B$. The so called “Nyquist sampling interval” is therefore $(2B)^{-1}$. A real proof (sketched
in Section 1.8) would give a reconstruction of the signal from these samples!

¹ Also called “fringe washing” in Chapter 4.

In words, the Shannon criterion is two samples per cycle of the maximum frequency
difference present. The usual intuition is that the centre frequency $\nu_0$ does not play a
role in these considerations. It just acts as a kind of rapid modulation which is completely
known and one does not have to sample variations at this frequency. This intuition
is consistent with radio engineers'/astronomers' fundamental right to move the centre
frequency around by heterodyning² with local (or even imported³) oscillators, but a more
careful examination shows that the centre frequency should satisfy $\nu_0 = (n + \tfrac{1}{2})B$ for the
sampling at a rate $2B$ to work.


1.5 The Central Limit and Pairing Theorems 

We now come to the statistics of $E(t)$. For example, we already know that $\langle E^2(t) \rangle = \sum_n r_n^2/2$.
How about $\langle E^3(t) \rangle$? It is quite easy to check that it is zero because

$$\langle r_l r_m r_n \cos(\omega_m t + \varphi_m) \cos(\omega_n t + \varphi_n) \cos(\omega_l t + \varphi_l) \rangle = 0$$

when we let the $\varphi$'s each vary independently over the full circle 0 to $2\pi$. This is true
whether $l, m, n$ are distinct or not. But coming to even powers like $\langle E^4(t) \rangle$, something
interesting happens. When we integrate a product like
$r_l r_m r_n r_p \cos(\omega_m t + \varphi_m)\cos(\omega_n t + \varphi_n)\cos(\omega_l t + \varphi_l)\cos(\omega_p t + \varphi_p)$
over all the four $\varphi$'s we can get non-zero answers, provided
the $\varphi$'s occur in pairs, i.e., if $l = m$ and $n = p$, then we encounter $\cos^2\varphi_l \times \cos^2\varphi_n$ which has
a non-zero average. (We saw a particular case of this when we calculated $\langle E(t)E(t+\tau) \rangle$
and only $r_n^2$ type terms survived).

Because of the random and independent phases of the large number of different frequencies,
we can now state the “pairing theorem”.

$$\langle E(t_1)E(t_2) \ldots E(t_{2k}) \rangle = \sum_{\mathrm{pairs}} \langle E(t_1)E(t_2) \rangle \ldots \langle E(t_{2k-1})E(t_{2k}) \rangle$$

As discussed in Section 1.8, this pairing theorem proves that the statistics is gaussian.
(A careful treatment shows that only the $r_l^2 r_m^2$ terms are equal on the two sides; we have
not quite got the $r_l^4$ terms right, but there are many more (of the order of $N$ times more)
of the former type and they dominate as $T \to \infty$ and the number of sines and cosines
we are adding is very large). This result, that the sum of a large number of small, finite
variance, independent terms has a gaussian distribution, is a particular case of the
“central limit theorem”. We only need the particular case where these terms are cosines
with random phases.

² See Chapter 3.

³ aaaaagggh! beware of weak puns (eds.)
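The remark in Section 1.5 that the $r_l^4$ terms spoil exact gaussianity, but only at the level of one part in $N$, can be made concrete for the fourth moment. For $N$ cosines of equal amplitude $r$ with independent random phases, counting the two kinds of surviving terms gives $\langle E^4 \rangle = 3\sigma^4 (1 - 1/2N)$, against the gaussian value $3\sigma^4$. The sketch below (Python, standard library only) evaluates both and compares with a direct random-phase average; the equal-amplitude assumption and the function names are illustrative choices, not taken from the original notes.

```python
import math
import random

def fourth_moment_exact(n_terms, r=1.0):
    """<E^4> for a sum of n_terms cosines of amplitude r with independent phases."""
    same = n_terms * r ** 4 * 3.0 / 8.0                        # r_l^4 terms, <cos^4> = 3/8
    paired = 3 * n_terms * (n_terms - 1) * (r * r / 2.0) ** 2  # r_l^2 r_m^2 terms, 3 pairings
    return same + paired

def gaussian_prediction(n_terms, r=1.0):
    """3 sigma^4 with sigma^2 = n_terms r^2 / 2."""
    sigma2 = n_terms * r * r / 2.0
    return 3.0 * sigma2 ** 2

def fourth_moment_sampled(n_terms, n_real, rng, r=1.0):
    """Monte Carlo estimate of <E^4> by drawing random phases."""
    total = 0.0
    for _ in range(n_real):
        e = sum(r * math.cos(rng.uniform(0, 2 * math.pi)) for _ in range(n_terms))
        total += e ** 4
    return total / n_real

for n in (1, 4, 16, 256):
    ratio = fourth_moment_exact(n) / gaussian_prediction(n)
    print(n, ratio)   # approaches 1 as the number of terms grows

print("sampled, N = 4:", fourth_moment_sampled(4, 50_000, random.Random(11)))
print("exact,   N = 4:", fourth_moment_exact(4))
```

A single cosine ($N = 1$) is far from gaussian, while a few hundred terms already agree with the gaussian fourth moment to a fraction of a percent.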
1.6 Quasimonochromatic and Complex Signals

For a strictly monochromatic signal, electrical engineers have known for a long time that
it is very convenient to use a complex voltage $V(t) = E_0 \exp(i(\omega t + \varphi))$ whose real part
gives the actual signal $E_r(t) = E_0 \cos(\omega t + \varphi)$. One need not think of the imaginary part
as a pure fiction since it can be obtained from the given signal by a phase shift of $\pi/2$,
viz. as $E_i(t) = E_0 \cos(\omega t + \varphi - \pi/2)$. In practice, since one invariably deals with signals
at an intermediate frequency derived by beating with a local oscillator, both the real and
imaginary parts are available by using two such oscillators $\pi/2$ out of phase. Squaring
and adding the real and imaginary parts give $E_r^2(t) + E_i^2(t) = V^*(t)V(t) = E_0^2$, which is the
power averaged over a cycle. This is actually closer to what is practically measured than
the instantaneous power, which fluctuates at a frequency $2\omega$.

These ideas go through even when we have a range of frequencies present, by simply
imagining the complex voltages corresponding to each of the monochromatic components
to be added. In mathematical terms, this operation of deriving $E_i(t)$ from $E_r(t)$ goes
by the name of the “Hilbert Transform”, and the time domain equivalent is described
in Section 1.8. But the physical interpretation is easiest when the different components
occupy a range $\Delta\omega$ (the so called “bandwidth”) which is small compared to the “centre
frequency” $\omega_0$. Such a signal is called “quasimonochromatic”, and can be represented as
below

$$E_q(t) = \mathrm{Re}\left[\exp(i\omega_0 t) \sum_{-\Delta\omega/2 < \omega_1 < \Delta\omega/2} E(\omega_1) \exp(i\omega_1 t + i\varphi(\omega_1))\right]$$

In this expression, $\omega_1$ is a frequency offset from the chosen centre $\omega_0$, so that $E(\omega_1)$
actually represents the amplitude at a frequency $\omega_0 + \omega_1$, and $\varphi(\omega_1)$ the phase. We can
now think of our quasimonochromatic signal as a rapidly varying phasor at the centre
frequency $\omega_0$, modulated by a complex voltage

$$V_m(t) = \sum_{-\Delta\omega/2 < \omega_1 < \Delta\omega/2} E(\omega_1) \exp(i\omega_1 t + i\varphi(\omega_1))$$

This latter phasor varies much more slowly than $\exp(i\omega_0 t)$. In fact, it takes a time
$\Delta\omega^{-1}$ for $V_m(t)$ to vary significantly since the highest frequencies present are of order $\Delta\omega$.
This time scale is much longer than the timescale $\omega_0^{-1}$ associated with the centre frequency.
Writing $V_m(t)$ in the polar form as $R(t)\exp(i\alpha(t))$, our original real signal reads

$$E_q(t) = R(t)\cos(\omega_0 t + \alpha(t))$$

We can think of $R$ and $\alpha$ as time dependent, slowly varying, amplitude and phase
modulation of an otherwise (hence “quasi”) monochromatic signal.
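The amplitude and phase representation can be checked directly. The sketch below (Python, standard library only) builds a quasimonochromatic signal from a few components with small offsets $\omega_1$ around $\omega_0$, forms $V_m(t)$, extracts $R(t)$ and $\alpha(t)$, and confirms that $R(t)\cos(\omega_0 t + \alpha(t))$ reproduces the direct sum of cosines. The component values and function names are assumptions chosen only for illustration.

```python
import cmath
import math

def modulation(t, offsets, amps, phases):
    """V_m(t) = sum over offsets w1 of E(w1) exp(i w1 t + i phi(w1))."""
    return sum(a * cmath.exp(1j * (w1 * t + p))
               for w1, a, p in zip(offsets, amps, phases))

def real_signal(t, w0, offsets, amps, phases):
    """E_q(t) written directly as a sum of cosines at frequencies w0 + w1."""
    return sum(a * math.cos((w0 + w1) * t + p)
               for w1, a, p in zip(offsets, amps, phases))

def envelope_form(t, w0, offsets, amps, phases):
    """E_q(t) written as R(t) cos(w0 t + alpha(t)), with R exp(i alpha) = V_m(t)."""
    vm = modulation(t, offsets, amps, phases)
    R, alpha = abs(vm), cmath.phase(vm)
    return R * math.cos(w0 * t + alpha)

# Assumed example: centre frequency 100, bandwidth 2 (arbitrary units).
w0 = 100.0
offsets = [-1.0, -0.3, 0.4, 1.0]
amps = [0.5, 1.0, 0.8, 0.3]
phases = [0.2, 1.7, 4.0, 5.5]

for t in (0.0, 0.37, 2.5, 11.0):
    print(t, real_signal(t, w0, offsets, amps, phases),
          envelope_form(t, w0, offsets, amps, phases),
          abs(modulation(t, offsets, amps, phases)))
```

The last column is the envelope $R(t)$, which changes appreciably only over times of order $\Delta\omega^{-1}$, much longer than the period of the carrier.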

While the mathematics did not assume smallness of $\Delta\omega$, the physical interpretation
does. If $R(t)$ changes significantly during a cycle, some of its values may not be attained
as maxima and hence its square cannot be regarded as measuring average power. This
is as it should be. No amount of algebra can uniquely extract two real functions $R(t)$ and
$\alpha(t)$ from a single real signal without further conditions (and the condition imposed is
explained in Section 1.8).

But returning to the quasimonochromatic case, we can now think of $V_m^*(t)V_m(t)$ as
the (slowly) time varying power in the signal. Likewise we can think of $\langle V_m^*(t)V_m(t+\tau) \rangle$ as
the autocorrelation. (A little algebra checks that this is the same as the autocorrelation
of the original real signal). One advantage in working with the complex signal is that the
centre frequency cancels in any such product containing one voltage and one complex
conjugate voltage. We can therefore think of such products as referring to properties
of the fluctuations of the signal amplitude and phase, and measure them even after
heterodyning has changed the centre frequency.


1.7 Cross Correlations 

We have so far thought of the signal as a function of time, after it enters the antenna. Let
us now liberate ourselves from one dimension (time) and think of the electric field as
existing in space and time, before it is collected by the antenna. In this view, one can obtain
a delayed version of the signal by moving along the longitudinal direction (direction of the
source). Thus, the frequency content is obtained by Fourier transforming a longitudinal
spatial correlation. As explained in Chapter 2, the spatial correlations transverse to the
direction of propagation carry information on the angular power spectrum of the signal,
i.e. the energy as a function of direction in the sky. With hindsight, this can be viewed
as a generalisation of the Wiener-Khinchin theorem to spatial correlations of a complex
electric field which is the sum of waves propagating in many different directions.
Historically, it arose quite independently (and at about the same time!) in the context of optical
interference. This is the van Cittert-Zernike theorem of Chapter 2. Since one is now
multiplying and averaging signals coming from different antennas, this is called a “cross
correlation function”. To get a non-vanishing average, one needs to multiply $E_1(x, t)$ by
$E_2^*(y, t)$. The complex conjugate sign in one of the terms ensures that this kind of product
looks at the phase difference. Writing out each signal as a sum with random phases, the
terms which leave a non-zero average are the ones in which an $e^{i\varphi_n}$ in an $E$ cancels an
$e^{-i\varphi_n}$ in an $E^*$. An (ill-starred?) product of two complex $E$'s with zero (or two!) complex
conjugate signs would average to zero.
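The role of the complex conjugate can be seen in a toy model. In the sketch below (Python, standard library only), two complex voltages are built from the same set of random phases, with the second one lagging each component by a fixed phase. Averaging over the random phases, the product with one conjugate converges to a sum of the fixed phase-difference factors, while the product with no conjugate averages to zero. The unit amplitudes, the common lag and the function names are illustrative assumptions.

```python
import cmath
import math
import random

def random_phase_pair(phase_lags, rng):
    """Two complex voltages sharing random phases; component n of the second lags by phase_lags[n]."""
    phis = [rng.uniform(0, 2 * math.pi) for _ in phase_lags]
    e1 = sum(cmath.exp(1j * p) for p in phis)
    e2 = sum(cmath.exp(1j * (p - lag)) for p, lag in zip(phis, phase_lags))
    return e1, e2

def averaged_products(phase_lags, n_real, rng):
    """Return (<E1 E2*>, <E1 E2>) averaged over n_real sets of random phases."""
    with_conj = 0j
    without_conj = 0j
    for _ in range(n_real):
        e1, e2 = random_phase_pair(phase_lags, rng)
        with_conj += e1 * e2.conjugate()
        without_conj += e1 * e2
    return with_conj / n_real, without_conj / n_real

lags = [0.3] * 5   # assumed: five components, each lagging by 0.3 rad
with_conj, without_conj = averaged_products(lags, 20_000, random.Random(5))
print("<E1 E2*> :", with_conj)
print("expected :", sum(cmath.exp(1j * lag) for lag in lags))
print("<E1 E2>  :", without_conj)
```

Only the conjugated product retains the phase difference between the two voltages; the unconjugated one carries the sum of the random phases and is washed out by the averaging.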


1.8 Mathematical details 

This section gives some more mathematical details of topics mentioned in the main text 
of the chapter. 

We first give the generalisation of the two variable gaussian to the joint distribution of
$k$ variables. Defining the covariance matrix $C_{ij} = \langle x_ix_j \rangle$, and $A = C^{-1}$, we have

$$P(x_1 \ldots x_k) = (2\pi)^{-k/2} (\det A)^{1/2} \exp\left(-\tfrac{1}{2} x^T A x\right)$$

The quadratic function $Q$ in the exponent has been written in matrix notation with $T$
for transpose. In full, it is $Q = \tfrac{1}{2}\sum_{ij} x_i a_{ij} x_j$. Notice that the only information we need for
the statistics of the amplitudes at $k$ different times is the autocorrelation function $C(\tau)$,
evaluated at all time differences $t_i - t_j$. Formally this is stated as “the gaussian process
is defined by its second order statistics”.

What would be practically useful is an explicit formula for the average value of an arbitrary
product $x_ix_jx_l \ldots$ in terms of the second order statistics $\langle x_1x_2 \rangle \langle x_3x_4 \rangle \ldots$ etc. The first
step is to see that a product of an odd number of $x$'s averages to zero. (The contributions
from $x_1 \ldots x_k$ and $-x_1 \ldots -x_k$ cancel).

For the case of an even number of gaussian variables to be multiplied and averaged,
there is a standard trick to evaluate an integral like $\int P(x_1 \ldots x_k) \, x_1x_2 \ldots \, dx_1 \ldots dx_k$. Define the
Fourier transform of $P$,

$$G(k_1 \ldots k_k) = \int P(x_1 \ldots x_k) \, e^{-i(k_1x_1 + \ldots + k_kx_k)} \, dx_1 \ldots dx_k$$





It is a standard result, derived by the usual device of completing the square, that this
Fourier transform is itself a gaussian function of the $k$'s, given by

$$G(k_1, \ldots, k_k) = \exp\left(-\tfrac{1}{2}\sum_{ij} C_{ij}k_ik_j\right)$$

Differentiating with respect to $k_1$ and then $k_2$, and putting all $k$'s equal to zero, pulls down
a factor $-x_1x_2$ into the integral and gives the desired average of $x_1x_2$. This trick now gives
the average of the product of a string of $x$'s in the form of the “pairing theorem”. This is
easier to state by an example.


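$$\langle x_1x_2x_3x_4 \rangle = \langle x_1x_2 \rangle \langle x_3x_4 \rangle + \langle x_1x_3 \rangle \langle x_2x_4 \rangle + \langle x_1x_4 \rangle \langle x_2x_3 \rangle$$

$$= C_{12}C_{34} + C_{13}C_{24} + C_{14}C_{23}$$

A sincere attempt to differentiate $G$ with respect to $k_1$, $k_2$, $k_3$ and $k_4$ and then put all $k$'s to
zero will show that the $C$'s get pulled down in precisely this combination. Deeper thought
shows that the pairing rule works even when the $x$'s are not all distinct, i.e.,

$$\langle x^4 \rangle = \langle x^2 \rangle \langle x^2 \rangle + \langle x^2 \rangle \langle x^2 \rangle + \langle x^2 \rangle \langle x^2 \rangle = 3\langle x^2 \rangle^2 = 3\sigma^4$$

or even $\langle x^{2n} \rangle = 1 \cdot 3 \cdot 5 \ldots (2n-1) \, \sigma^{2n}$.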

The last property is easily checked from the single variable gaussian

$$(2\pi\sigma^2)^{-1/2} \exp(-x^2/2\sigma^2)$$

Since the pairing theorem allows one to calculate all averages, it could even be taken
to define a gaussian signal, and that is what we do in the main text.
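The pairing rule is mechanical enough to be written as a short recursion. The sketch below (Python, standard library only) enumerates all ways of splitting a list of indices into pairs and sums the corresponding products of covariances; repeated indices are allowed, which reproduces $\langle x^4 \rangle = 3\sigma^4$ and $\langle x^6 \rangle = 15\sigma^6$. The function names and the example covariance matrix are hypothetical, chosen only to illustrate the rule.

```python
def pairings(indices):
    """Yield every way of splitting an even-length list of indices into pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for j in range(len(rest)):
        partner = rest[j]
        remaining = rest[:j] + rest[j + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def gaussian_moment(indices, C):
    """<x_i x_j ...> for zero-mean gaussian variables with covariance matrix C."""
    if len(indices) % 2 == 1:
        return 0.0   # an odd number of x's averages to zero
    total = 0.0
    for pairing in pairings(list(indices)):
        term = 1.0
        for i, j in pairing:
            term *= C[i][j]
        total += term
    return total

# Assumed example covariance matrix for four variables.
C = [[1.0, 0.5, 0.2, 0.1],
     [0.5, 1.0, 0.3, 0.4],
     [0.2, 0.3, 1.0, 0.6],
     [0.1, 0.4, 0.6, 1.0]]
print(gaussian_moment((0, 1, 2, 3), C))   # C12 C34 + C13 C24 + C14 C23
print(gaussian_moment((0, 0, 0, 0), [[2.0]]))          # 3 sigma^4 with sigma^2 = 2
print(gaussian_moment((0, 0, 0, 0, 0, 0), [[2.0]]))    # 15 sigma^6 with sigma^2 = 2
```

The number of pairings of $2n$ indices is $1 \cdot 3 \cdot 5 \ldots (2n-1)$, which is where the coefficient in $\langle x^{2n} \rangle$ comes from.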

We now sketch a proof of the sampling theorem. Start with a band limited (i.e. containing
only frequencies less than $B$) signal sampled at the Nyquist rate, $E_r(n/2B)$. The
following expression gives a way of constructing a continuous signal $E_c(t)$ from our
samples.

$$E_c(t) = \sum_n E_r(n/2B) \, \mathrm{sinc}\left(2\pi B\left(t - \frac{n}{2B}\right)\right)$$

It is also known as Whittaker's interpolation formula. Each sinc function is diabolically
chosen to give unity at one sample point and zero at all the others, so $E_c(t)$ is guaranteed
to agree with our samples of $E_r(t)$. It is also band limited (Fourier transform of a flat
function extending from $-B$ to $+B$). All that is left to check is that it has the same
Fourier coefficients as $E_r(t)$ (it does). And hence, we have reconstructed a band limited
function from its Nyquist samples, as promised.

We add a few comments on the notion of Hilbert transform mentioned in the context
of associating a complex signal with a real one. It looks rather innocent in the frequency
domain: just subtract $\pi/2$ from the phase of each cosine in the Fourier series of $E_r(t)$ and
reassemble to get $E_i(t)$. In terms of complex Fourier coefficients, it is a multiplication
of the positive frequency component by $-i$ and of the corresponding negative frequency
component by $+i$. Apart from the $i$, this is just multiplication by a step function of the
symmetric type, jumping from minus 1 to plus 1 at zero frequency. Hence, in the time
domain, it is a convolution of $E_r(t)$ by a kernel which is the Fourier transform of this
step function, viz. $1/t$ (the value $t = 0$ being excluded by the usual principal value rule).
Explicitly, we have

$$E_i(t) = \int E_r(s) \, P\left[\frac{1}{t-s}\right] \frac{ds}{\pi}$$

There is a similar formula relating $E_r$ to $E_i$ which only differs by a minus sign. This
is sufficient to show that one needs values from the infinite past, and more disturbingly,
future, of $t$ to compute $E_i(t)$. This is beyond the reach of ordinary mortals, even those
equipped with the best filters and phase shifters. Practical schemes to derive the complex
signal in real time thus have to make approximations as a concession to causality.

As remarked in the main text, there are many complex signals whose real parts would
give our measured $E_r(t)$. The choice made above seemed natural because it was motivated
by the quasimonochromatic case. It also has the mathematical property of creating a
function which is very well behaved in the upper half plane of $t$ regarded as a complex
variable (should one ever want to go there). The reason is that $V(t)$ is constructed to
have terms like $e^{i\omega t}$ with only positive values of $\omega$. Hence the pedantic name of “analytic
signal” for this descendant of the humble phasor. It was the more general problem of
continuing something given on the real axis to be well behaved in the upper half plane
which attracted someone of Hilbert's IQ to this transform.



Chapter 2 

Interferometry and Aperture Synthesis


A. P. Rao 

2.1 Introduction 

Radio astronomy is the study of the sky at radio wavelengths. While optical astronomy
has been a field of study from time immemorial, the “new” astronomies, viz. radioastronomy,
X-ray, IR and UV astronomy, are only about 50 years old. At many of these
wavelengths it is essential to put the telescopes outside the confines of the Earth's
atmosphere and so most of these “new” astronomies have become possible only with the
advent of space technology. However, since the atmosphere is transparent in the radio
band (which covers a frequency range of 10 MHz to 300 GHz or a wavelength range of
approximately 1 mm to 30 m) radio astronomy can be done by ground based telescopes
(see also Chapter 3).

The field of radioastronomy was started in the early 1930s when Karl Jansky (working at the Bell
Labs on trying to reduce the noise in radio receivers) discovered that his antenna was
receiving radiation from outside the Earth's atmosphere. He noticed that this radiation
appeared at the same sidereal (as opposed to solar) time on different days and that its
source must hence lie far outside the solar system. Further observations enabled him to
identify this radio source as the centre of the Galaxy. To honour this discovery, the unit
of flux density in radioastronomy is named after Jansky where

$$1 \ \mathrm{Jansky} = 10^{-26} \ \mathrm{W \, m^{-2} \, Hz^{-1}} \qquad (2.1.1)$$

Radio astronomy matured during the second world war when many scientists worked
on projects related to radar technology. One of the major discoveries of that period (made
while trying to identify the locations of jamming radar signals) was that the sun is a
strong emitter of radio waves and its emission is time variable. After the war, the scientists
involved in these projects returned to academic pursuits and used surplus equipment
from the war to rapidly develop this new field of radioastronomy. In the early
phases, radioastronomy was dominated by radio and electronic engineers and the
astronomy community (dominated by optical astronomers) needed considerable persuasion
to be convinced that these new radio astronomical discoveries were of relevance to
astronomy in general. While the situation has changed considerably since then, much
of the jargon of radio astronomy (which is largely borrowed from electrical engineering)
remains unfamiliar to a person with a pure physics background. The coherent detection
techniques pioneered by radio astronomers also remain by and large not well understood
by astronomers working at other wavelength bands. This set of lecture notes aims
to familiarize students of physics (or students of astronomy at other wavelengths) with
the techniques of radio astronomy.


2.2 The Radio Sky 

The sky looks dramatically different at different wave bands and this is the primary reason
multi-wavelength astronomy is interesting. In the optical band, the dominant emitters
are stars, luminous clouds of gas, and galaxies, all of which are thermal sources
with temperatures in the range $10^3 - 10^4$ K. At these temperatures the emitted spectrum
peaks in the optical band. Sources with temperatures outside this range and emitters of
non thermal radiation are relatively weak emitters in the optical band but can be strong
emitters in other bands. For example, cold ($\sim 100$ K) objects emit strongly in the infra red
and very hot objects ($> 10^5$ K) emit strongly in X-rays. Since the universe contains all of
these objects one needs to make multiband studies in order to fully understand it.

For a thermal source with temperature greater than 100 K, the flux density in the 
radio band can be well approximated by the Rayleigh-Jeans Law¹, viz. 

$$S = \left(\frac{2kT}{\lambda^2}\right) d\Omega \tag{2.2.2}$$

The predicted flux densities at radio wavelengths are minuscule and one might hence 
imagine that the radio sky should be dark and empty. However, radio observations reveal 
a variety of radio sources all of which have flux densities much greater than given by 
the Rayleigh-Jeans Law, i.e. the radio emission that they emit is not thermal in nature. 
Today it is known that the bulk of radio emission is produced via the synchrotron 
mechanism. Energetic electrons spiraling in magnetic fields emit synchrotron radiation. Unlike 
thermal emission, where the flux density increases with frequency, for synchrotron 
emitters the flux density increases with wavelength (see Figure 2.1). Synchrotron emitting 
sources are hence best studied at low radio frequencies. 
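
To get a feeling for just how small the Rayleigh-Jeans prediction is, the following is a minimal sketch that evaluates equation 2.2.2 in Jansky. The source parameters (a $10^4$ K thermal disc of 1 arcsecond diameter observed at a wavelength of 1 m) are illustrative assumptions and do not come from the text; the function name is likewise hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K
JANSKY = 1e-26               # 1 Jy in W m^-2 Hz^-1 (equation 2.1.1)
RAD_PER_ARCSEC = math.pi / (180.0 * 3600.0)


def rayleigh_jeans_flux_jy(temperature_k, wavelength_m, solid_angle_sr):
    """Flux density S = (2 k T / lambda^2) * dOmega, converted to Jansky."""
    s_si = 2.0 * K_BOLTZMANN * temperature_k / wavelength_m**2 * solid_angle_sr
    return s_si / JANSKY


# Assumed example source: uniform disc, 1 arcsec in diameter, T = 1e4 K.
radius_rad = 0.5 * RAD_PER_ARCSEC
solid_angle = math.pi * radius_rad**2          # small-angle approximation
flux_1m = rayleigh_jeans_flux_jy(1.0e4, 1.0, solid_angle)
flux_21cm = rayleigh_jeans_flux_jy(1.0e4, 0.21, solid_angle)

print(f"S at 1 m   : {flux_1m:.2e} Jy")
print(f"S at 21 cm : {flux_21cm:.2e} Jy")
```

For these assumed numbers the flux density at 1 m comes out well below a milli-Jansky, and it rises as $1/\lambda^2$ towards shorter wavelengths, which is the thermal behaviour sketched in Figure 2.1.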

The dominant sources seen in the radio sky are the Sun, supernova remnants, radio 
galaxies, pulsars etc. The Sun has a typical flux density of $10^5$ Jy while the next strongest 
sources are the radio galaxy Cygnus A and the supernova remnant Cassiopeia A, both 
of which have flux densities of ~ $10^4$ Jy. Current technology permits the detection of 
sources as weak as a few $\mu$Jy. It turns out also that not all thermal sources are too weak 
to detect; the thermal emission from large and relatively nearby HII regions can also be 
detected easily in the radio band. 

Radio emission from synchrotron and thermal emitters is “broad band”, i.e. the emission 
varies smoothly (often by a power law) over the whole radio band. Since the spectrum 
is relatively smooth, one can determine it by measurements of flux density at a 
finite number of frequencies. This is a major advantage since radio telescopes tend to be 
narrow band devices with small frequency spreads ($\Delta\nu/\nu \sim 0.1$). This is partly because 
it is not practical to build a single radio telescope that can cover the whole radio band 
(see e.g. Chapter 3) but mainly because radio astronomers share the radio band with a 
variety of other users (e.g. radar, cellular phones, pagers, TV etc.), all of whom radiate at 
power levels high enough to completely swamp the typical radio telescope. By international 
agreement, the radio spectrum is allocated to different users. Radio astronomy has 
a limited number of protected bands where no one else is permitted to radiate and most 
radio telescopes work only at these protected frequencies. 

¹ The Rayleigh-Jeans Law, as can be easily verified, is the limit of the Planck law when $h\nu \ll kT$. This 
inequality is easily satisfied in the radio regime for generally encountered astrophysical temperatures. 

Figure 2.1: Intensity as a function of frequency (“power spectra”) for synchrotron (dashed) 
and thermal (solid) radio sources. 

Several atoms and molecules have spectral lines in the radio band. For example, the 
hyperfine transition of the Hydrogen atom corresponds to a line with a wavelength of 
~ 21cm. Since atomic hydrogen (HI) is an extremely abundant species in the universe 
this line is one of the brightest naturally occurring radio lines. The HI 21cm line has 
been extensively used to study the kinematics of nearby galaxies. High quantum number 
recombination lines emitted by hydrogen and carbon also fall in the radio band and can 
be used to study the physical conditions in the ionized interstellar medium. Further, the 
radio line emission from molecules like OH, SiO, H$_2$O etc. tends to be maser amplified 
in the interstellar medium and can often be detected to very large distances. Of course, 
these lines can be studied only if they fall within the protected radio bands. In fact, the 
presence of radio lines is one of the justifications for asking for protection in a specific 
part of the radio spectrum. While many of the important radio lines have been protected 
there are many outside the protected bands that cannot be studied, which is a source of 
concern. Further, with radio telescopes becoming more and more sensitive, it is possible 
to study lines like the 21cm line to greater and greater distances. Since in the expanding 
universe, distance translates to a redshift, this often means that these lines emitted 
by distant objects move out of the protected radio band and can become unobservable 
because of interference. 
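
The way redshift carries a line out of a protected band can be sketched in a few lines. The rest frequency of the HI line is about 1420.406 MHz; the band edges used below (1400 to 1427 MHz) are stated here as an assumption about the protected allocation around this line and are not taken from the text, and the function names are hypothetical.

```python
HI_REST_MHZ = 1420.40575               # rest frequency of the HI hyperfine line
PROTECTED_BAND_MHZ = (1400.0, 1427.0)  # assumed protected band around the line


def observed_frequency_mhz(redshift, rest_mhz=HI_REST_MHZ):
    """Observed frequency of a line emitted at redshift z: nu_obs = nu_rest / (1 + z)."""
    return rest_mhz / (1.0 + redshift)


def is_protected(freq_mhz, band=PROTECTED_BAND_MHZ):
    """True if the frequency lies inside the (assumed) protected band."""
    low, high = band
    return low <= freq_mhz <= high


def max_protected_redshift(rest_mhz=HI_REST_MHZ, band=PROTECTED_BAND_MHZ):
    """Largest redshift at which the line is still above the lower band edge."""
    return rest_mhz / band[0] - 1.0


for z in (0.0, 0.01, 0.1, 1.0):
    nu = observed_frequency_mhz(z)
    print(f"z = {z:4.2f}: {nu:8.2f} MHz, protected: {is_protected(nu)}")
print(f"line leaves the assumed band beyond z ~ {max_protected_redshift():.4f}")
```

Under this assumption the 21cm line stays inside the band only for redshifts of order a per cent, so any survey of more distant hydrogen has to work at unprotected frequencies.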


2.3 Signals in Radio Astronomy 

A fundamental property of the radio waves emitted by cosmic sources is that they are 
stochastic in nature, i.e. the electric field at Earth due to a distant cosmic source can 
be treated as a random process². Random processes can be simply understood as a 
generalization of random variables. Recall that a random variable $x$ can be defined as 
follows. For every outcome $o$ of some given experiment (say the tossing of a die) one 
assigns a given number to $x$. Given the probabilities of the different outcomes of the 
experiment one can then compute the mean value of $x$, the variance of $x$ etc. If for every 
outcome of the experiment instead of a number one assigns a given function to $x$, then 
the associated process $x(t)$ is called a random process. For a fixed value of $t$, $x(t)$ is simply 
a random variable and one can compute its mean, variance etc. as before. 

A commonly used statistic for random processes is the auto-correlation function. The 
auto-correlation function is defined as 

$$r_{xx}(t,\tau) = \langle x(t)\,x(t+\tau) \rangle$$

where the angular brackets indicate taking the mean value. For a particularly important 
class of random processes, called wide sense stationary (WSS) processes, the 
auto-correlation function is independent of changes of the origin of $t$ and is a function of $\tau$ 
alone, i.e. 

$$r_{xx}(\tau) = \langle x(t)\,x(t+\tau) \rangle$$

For $\tau = 0$, $r_{xx}(\tau)$ is simply the variance $\sigma^2$ of $x(t)$ (taking $x(t)$ to have zero mean), which 
for a WSS process is independent of $t$. 

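The Fourier transform $S(\nu)$ of the auto-correlation function is called the power 
spectrum, i.e. 

$$S(\nu) = \int_{-\infty}^{\infty} r_{xx}(\tau)\, e^{-i2\pi\tau\nu}\, d\tau$$

Equivalently, $r_{xx}(\tau)$ is the inverse Fourier transform of $S(\nu)$, or 

$$r_{xx}(\tau) = \int_{-\infty}^{\infty} S(\nu)\, e^{i2\pi\tau\nu}\, d\nu$$

Hence 

$$r_{xx}(0) = \sigma^2 = \int_{-\infty}^{\infty} S(\nu)\, d\nu$$

i.e. since $\sigma^2$ is the “power” in the signal, $S(\nu)$ is a function describing how that power is 
distributed in frequency space, i.e. the “power spectrum”. 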
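
The relations between the auto-correlation function, the power spectrum and the variance have exact discrete analogues that are easy to check numerically. The sketch below uses a finite, zero-mean sequence of Gaussian random numbers as a stand-in for $x(t)$ and the FFT as a stand-in for the Fourier transform; the normalizations and the circular (wrap-around) definition of the lag are choices made for this illustration only.

```python
import numpy as np


def power_spectrum(x):
    """Discrete power spectrum, normalized so that its sum equals the mean power."""
    n = len(x)
    return np.abs(np.fft.fft(x)) ** 2 / n**2


def circular_autocorrelation(x):
    """r[k] = mean over t of x[t] * x[(t + k) mod n], computed via the FFT."""
    n = len(x)
    return np.fft.ifft(np.abs(np.fft.fft(x)) ** 2).real / n


rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, 4096)
x = x - x.mean()                      # zero mean, so r[0] is the variance

variance = np.mean(x**2)
spectrum = power_spectrum(x)
autocorr = circular_autocorrelation(x)

print("variance              :", variance)
print("sum of power spectrum :", spectrum.sum())
print("autocorrelation at 0  :", autocorr[0])
```

The three printed numbers agree to rounding error, which is the discrete counterpart of the statement that the power spectrum integrates to $\sigma^2$.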

A process whose auto-correlation function is a delta function has a power spectrum 
that is flat; such a process is called “white noise”. As mentioned in Section 2.2, many 
radio astronomical signals have spectra that are relatively flat; these signals can hence be 
approximated as white noise. Radio astronomical receivers however have limited 
bandwidths. This means that even if the signal input to the receiver is white noise, the 
signal after passing through the receiver has power only in a finite frequency range. Its 
auto-correlation function is hence no longer a delta function, but is a sinc function (see 
Section 2.5) with a width ~ $1/\Delta\nu$, where $\Delta\nu$ is the bandwidth of the receiver. The width 
of the auto-correlation function is also called the “coherence time” of the signal. The 
bandwidth $\Delta\nu$ is typically much smaller than the central frequency $\nu$ at which the 
radio receiver operates. Such signals are hence also often called “quasi-monochromatic” 
signals. Much like a monochromatic signal can be represented by a constant complex 
phasor, quasi-monochromatic signals can be represented by complex random processes. 
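
The sinc shape and the $1/\Delta\nu$ width can be checked directly from the inverse transform of a flat, band-limited power spectrum. The sketch below assumes an idealized rectangular band of unit spectral density centred on zero frequency (i.e. the signal after conversion to baseband) and a bandwidth of 1 MHz; both are illustrative assumptions.

```python
import numpy as np


def autocorr_flat_band(tau, bandwidth, n_freq=4001):
    """Numerically integrate S(nu) * exp(i 2 pi tau nu) over a flat band.

    S(nu) = 1 for |nu| < bandwidth / 2 and 0 elsewhere. The imaginary part
    cancels by symmetry, so only the cosine term is kept.
    """
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    nu = np.linspace(-bandwidth / 2.0, bandwidth / 2.0, n_freq)
    integrand = np.cos(2.0 * np.pi * np.outer(tau, nu))
    # mean value over the band times the bandwidth approximates the integral
    return bandwidth * integrand.mean(axis=1)


bandwidth = 1.0e6                                  # assumed 1 MHz receiver band
lags = np.array([0.0, 0.25e-6, 0.5e-6, 1.0e-6, 2.0e-6])

numeric = autocorr_flat_band(lags, bandwidth)
analytic = bandwidth * np.sinc(bandwidth * lags)   # np.sinc(x) = sin(pi x) / (pi x)

for lag, a, b in zip(lags, numeric / bandwidth, analytic / bandwidth):
    print(f"lag = {lag * 1e6:4.2f} us   numeric = {a:+.4f}   sinc = {b:+.4f}")
```

The first zero of the auto-correlation falls at a lag of $1/\Delta\nu$, i.e. 1 microsecond for the assumed 1 MHz band.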

Given two random processes $x(t)$ and $y(t)$, one can define a cross-correlation function 

$$r_{xy}(\tau) = \langle x(t)\,y(t-\tau) \rangle$$

where one has assumed that the signals are WSS so that the cross-correlation function is 
a function of $\tau$ alone. The cross-correlation function and its Fourier transform, the cross 
power spectrum, are also widely used in radio astronomy. 

² See Chapter 1 for a more detailed discussion of topics discussed in this section. 

We have so far been dealing with random processes that are a function of time alone. 
The signal received from a distant cosmic source is in general a function both of the 
receiver’s location as well as of time. Much as we defined temporal correlation functions 
above, one can also define spatial correlation functions. If the signal at the observer’s 
plane at any instant is $E(\mathbf{r})$, then the spatial correlation function is defined as: 

$$V(\mathbf{x}) = \langle E(\mathbf{r})\,E^*(\mathbf{r}+\mathbf{x}) \rangle$$

Note that strictly speaking the angular brackets imply ensemble averaging. In practice 
one averages over time³ and assumes that the two averaging procedures are equivalent. 
The function $V$ is referred to as the “visibility function” (or just the “visibility”) and as we 
shall see below, it is of fundamental interest in interferometry. 


2.4 Interferometry 

2.4.1 The Need for Interferometry 

The idea that the resolution of optical instruments is limited due to the wave nature of 
light is familiar to students of optics and is embodied in Rayleigh’s criterion, which 
states that the angular resolution of a telescope/microscope is ultimately diffraction 
limited and is given by 

$$\theta \sim \lambda/D \tag{2.4.3}$$

where $D$ is some measure of the aperture size. The need for higher angular resolution 
has led to the development of instruments with larger size and which operate at smaller 
wavelengths. In radio astronomy, the wavelengths are so large that even though the sizes 
of radio telescopes are large, the angular resolution is still poor compared to optical 
instruments. Thus while the human eye has a diffraction limit of ~ 20″ and even modest 
optical telescopes have diffraction limits⁴ of 0.1″, even the largest radio telescopes (300 m 
in diameter) have angular resolutions of only ~ 10′ at 1 metre wavelength. To achieve 
higher resolutions one has to either increase the diameter of the telescope further (which 
is not practical) or decrease the observing wavelength. The second option has led to a 
tendency for radio telescopes to operate at centimetre and millimetre wavelengths, which 
leads to high angular resolutions. These telescopes are however restricted to studying 
sources that are bright at cm and mm wavelengths. To achieve high angular resolutions 
at metre wavelengths one needs telescopes with apertures that are hundreds of 
kilometers in size. Single telescopes of this size are clearly impossible to build. Instead radio 
astronomers achieve such angular resolutions using a technique called aperture 
synthesis. Aperture synthesis is based on interferometry, the principles of which are familiar 
to most physics students. There is in fact a deep analogy between the double slit 
experiment with quasi-monochromatic light and the radio two element interferometer. Instead 
of setting up this analogy we choose the more common route to radio interferometry via 
the van Cittert-Zernike theorem. 
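
The numbers quoted above follow directly from equation 2.4.3, as the short sketch below illustrates. The pupil diameter (5 mm), the optical wavelength (500 nm) and the 1 m optical telescope are assumed, order-of-magnitude values chosen for this illustration; only the 300 m dish at 1 m wavelength is taken from the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def diffraction_limit_arcsec(wavelength_m, aperture_m):
    """Angular resolution theta ~ lambda / D (equation 2.4.3), in arcseconds."""
    return wavelength_m / aperture_m * ARCSEC_PER_RAD


def aperture_for_resolution_m(wavelength_m, resolution_arcsec):
    """Aperture size D ~ lambda / theta needed for a given resolution."""
    return wavelength_m / (resolution_arcsec / ARCSEC_PER_RAD)


eye = diffraction_limit_arcsec(500e-9, 5e-3)       # assumed 5 mm pupil
optical = diffraction_limit_arcsec(500e-9, 1.0)    # assumed 1 m optical telescope
dish = diffraction_limit_arcsec(1.0, 300.0)        # 300 m dish at 1 m wavelength

print(f"human eye        : {eye:6.1f} arcsec")
print(f"1 m optical      : {optical:6.2f} arcsec")
print(f"300 m radio dish : {dish / 60.0:6.1f} arcmin")
print(f"aperture for 1 arcsec at 1 m wavelength: "
      f"{aperture_for_resolution_m(1.0, 1.0) / 1e3:.0f} km")
```

Matching even arcsecond resolution at metre wavelengths requires an aperture of order 200 km, which is why a single dish is not an option.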


³ For typical radio receiver bandwidths of a few MHz, the coherence time is of the order of microseconds, so 
in a few seconds one gets several million independent samples to average over. 

⁴ The actual resolution achieved by these telescopes is however usually limited by atmospheric seeing. 
2.4.2 The Van Cittert-Zernike Theorem 

The van Cittert-Zernike theorem relates the spatial coherence function 
$V(\mathbf{r}_1,\mathbf{r}_2) = \langle E(\mathbf{r}_1)E^*(\mathbf{r}_2)\rangle$ 
to the distribution of intensity of the incoming radiation, $\mathcal{I}(\mathbf{s})$. It shows that the spatial 
correlation function $V(\mathbf{r}_1,\mathbf{r}_2)$ depends only on $\mathbf{r}_1 - \mathbf{r}_2$ and that if all the measurements are 
in a plane, then 

$$V(\mathbf{r}_1,\mathbf{r}_2) = \mathcal{F}\{\mathcal{I}(\mathbf{s})\} \tag{2.4.4}$$

where $\mathcal{F}$ implies taking the Fourier transform. Proof of the van Cittert-Zernike theorem 
can be found in a number of textbooks, e.g. “Optics” by Born and Wolf, “Statistical Optics” 
by Goodman, “Interferometry and Synthesis in Radio Astronomy” by Thompson et al. We 
give here only a rough proof to illustrate the basic ideas. 

Let us assume that the source is distant and can be approximated as a brightness 
distribution on the celestial sphere of radius $R$ (see Figure 2.2). Let the electric field⁵ at a 
point $P'_1(x'_1, y'_1, z'_1)$ at the source be given by $\mathcal{E}(P'_1)$. The field $E(P_1)$ at the observation point 
$P_1(x_1, y_1, z_1)$ is given by⁶ 

$$E(P_1) = \int \mathcal{E}(P'_1)\, \frac{e^{-ikD(P'_1,P_1)}}{D(P'_1,P_1)}\, d\Omega_1 \tag{2.4.5}$$

where $D(P'_1, P_1)$ is the distance between $P'_1$ and $P_1$. Similarly if $E(P_2)$ is the field at some 
other observing point $P_2(x_2, y_2, z_2)$ then the cross-correlation between these two fields is 
given by 

$$\langle E(P_1)E^*(P_2)\rangle = \int \langle \mathcal{E}(P'_1)\mathcal{E}^*(P'_2)\rangle\, \frac{e^{-ik[D(P'_1,P_1)-D(P'_2,P_2)]}}{D(P'_1,P_1)\,D(P'_2,P_2)}\, d\Omega_1\, d\Omega_2 \tag{2.4.6}$$

⁵ We assume here for the moment that the electric field is a scalar quantity. See Chapter 15 for the extension 
to vector fields. 

⁶ Where we have invoked Huygens’ principle. A more rigorous proof would use scalar diffraction theory. 



If we further assume that the emission from the source is spatially incoherent, i.e. 
that $\langle \mathcal{E}(P'_1)\mathcal{E}^*(P'_2)\rangle = 0$ except when $P'_1 = P'_2$, then we have 

$$\langle E(P_1)E^*(P_2)\rangle = \int \mathcal{I}(P'_1)\, \frac{e^{-ik[D(P'_1,P_1)-D(P'_1,P_2)]}}{D(P'_1,P_1)\,D(P'_1,P_2)}\, d\Omega_1 \tag{2.4.7}$$

where $\mathcal{I}(P'_1)$ is the intensity at the point $P'_1$. Since we have assumed that the source 
can be approximated as lying on a celestial sphere of radius $R$ we have $x'_1 = R\cos(\theta_x) = Rl$, 
$y'_1 = R\cos(\theta_y) = Rm$, and $z'_1 = R\cos(\theta_z) = Rn$; $(l, m, n)$ are called “direction cosines”. It can 
be easily shown⁷ that $l^2 + m^2 + n^2 = 1$ and that $d\Omega = \frac{dl\,dm}{n} = \frac{dl\,dm}{\sqrt{1-l^2-m^2}}$. We then have: 


$$\begin{aligned}
D(P'_1,P_1) &= \left[(x'_1 - x_1)^2 + (y'_1 - y_1)^2 + (z'_1 - z_1)^2\right]^{1/2} \\
 &= \left[(Rl - x_1)^2 + (Rm - y_1)^2 + (Rn - z_1)^2\right]^{1/2} \\
 &= R\left[(l - x_1/R)^2 + (m - y_1/R)^2 + (n - z_1/R)^2\right]^{1/2} \\
 &\simeq R\left[(l^2 + m^2 + n^2) - \tfrac{2}{R}(l x_1 + m y_1 + n z_1)\right]^{1/2} \\
 &\simeq R - (l x_1 + m y_1 + n z_1)
\end{aligned} \tag{2.4.8--2.4.13}$$


Putting this back into equation 2.4.7 we get 

$$\langle E(P_1)E^*(P_2)\rangle = \frac{1}{R^2} \int \mathcal{I}(l,m)\, e^{-ik[l(x_2-x_1)+m(y_2-y_1)+n(z_2-z_1)]}\, \frac{dl\,dm}{\sqrt{1-l^2-m^2}} \tag{2.4.14}$$

Note that since $l^2 + m^2 + n^2 = 1$, the two direction cosines $(l, m)$ are sufficient to 
uniquely specify any given point on the celestial sphere, which is why the intensity $\mathcal{I}$ has 
been written out as a function of $(l, m)$ only. It is customary to measure distances in the 
observing plane in units of the wavelength $\lambda$, and to define “baseline co-ordinates” $u, v, w$ 
such that $u = (x_2 - x_1)/\lambda$, $v = (y_2 - y_1)/\lambda$, and $w = (z_2 - z_1)/\lambda$. The spatial correlation 
function $\langle E(P_1)E^*(P_2)\rangle$ is also referred to as the “visibility” $V(u, v, w)$. Since $k = 2\pi/\lambda$, 
apart from the constant factor $1/R^2$ (which we will ignore henceforth) equation 2.4.14 can then be written as 

$$V(u,v,w) = \int \mathcal{I}(l,m)\, e^{-i2\pi[lu+mv+nw]}\, \frac{dl\,dm}{\sqrt{1-l^2-m^2}} \tag{2.4.15}$$


This fundamental relationship between the visibility and the source intensity 
distribution is the basis of radio interferometry. In the optical literature this relationship is also 
referred to as the van Cittert-Zernike theorem. 

Equation 2.4.15 resembles a Fourier transform. There are two situations in which it 
does reduce to a Fourier transform. The first is when the observations are confined to 
the U-V plane, i.e. when $w = 0$. In this case we have 

$$V(u,v) = \int \frac{\mathcal{I}(l,m)}{\sqrt{1-l^2-m^2}}\, e^{-i2\pi[lu+mv]}\, dl\,dm \tag{2.4.16}$$

i.e. the visibility $V(u, v)$ is the Fourier transform of the modified brightness distribution 
$\frac{\mathcal{I}(l,m)}{\sqrt{1-l^2-m^2}}$. The second situation is when the source brightness distribution is limited to 
a small region of the sky. This is a good approximation for arrays of parabolic antennas 
because each antenna responds only to sources which lie within its primary beam (see 
Chapter 3). The primary beam is typically < 1°, which is a very small area of sky. In this 
case $n = \sqrt{1-l^2-m^2} \simeq 1$. Equation 2.4.15 then becomes 

$$V(u,v,w) = e^{-i2\pi w} \int \mathcal{I}(l,m)\, e^{-i2\pi[lu+mv]}\, dl\,dm \tag{2.4.17}$$

or if we define a modified visibility $\tilde{V}(u,v) = V(u,v,w)\,e^{i2\pi w}$ we have 

$$\tilde{V}(u,v) = \int \mathcal{I}(l,m)\, e^{-i2\pi[lu+mv]}\, dl\,dm \tag{2.4.18}$$

⁷ See for example Christiansen & Högbom, “Radio Telescopes”, Cambridge University Press. 
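
To make equation 2.4.18 concrete, the following sketch evaluates it for a sky made up of a few point sources, for which the integral collapses to a sum of complex exponentials. The source positions, fluxes and baselines are made-up illustrative values, and the function name `visibility` is hypothetical.

```python
import numpy as np


def visibility(u, v, l, m, flux):
    """Small-field visibility of point sources: sum_k flux_k exp(-i 2 pi (l_k u + m_k v)).

    u, v are baseline co-ordinates in wavelengths; l, m are direction cosines.
    """
    u = np.atleast_1d(np.asarray(u, dtype=float))[:, None]
    v = np.atleast_1d(np.asarray(v, dtype=float))[:, None]
    l = np.atleast_1d(np.asarray(l, dtype=float))[None, :]
    m = np.atleast_1d(np.asarray(m, dtype=float))[None, :]
    flux = np.atleast_1d(np.asarray(flux, dtype=float))[None, :]
    return np.sum(flux * np.exp(-2j * np.pi * (u * l + v * m)), axis=1)


u = np.array([0.0, 100.0, 250.0])     # baselines along U, in wavelengths
v = np.zeros_like(u)

# A single 1 Jy source at the field centre: the same visibility on every baseline.
centred = visibility(u, v, [0.0], [0.0], [1.0])

# Two 1 Jy sources at l = +/- 1e-3: the amplitude is modulated as 2 cos(2 pi u l0).
double = visibility(u, v, [1e-3, -1e-3], [0.0, 0.0], [1.0, 1.0])

print("centred source:", np.round(centred, 3))
print("double source :", np.round(double, 3))
```

The double source is "resolved out" on the baseline where $u = 1/(4 l_0)$, a simple instance of the inverse relation between baseline length and the angular scale to which it is sensitive.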

2.4.3 Aperture Synthesis 

As we saw in the previous section, the spatial correlation of the electric field in the U-V 
plane is related to the source brightness distribution. Further, for the typical radio array 
the relationship between the measured visibility and the source brightness distribution 
is a simple Fourier transform. Correlation of the voltages from any two radio antennas 
then allows the measurement of a single Fourier component of the source brightness 
distribution. Given a sufficient number of measurements the source brightness distribution 
can then be obtained by Fourier inversion. The derived image of the sky is usually called 
a “map” in radio astronomy, and the process of producing the image from the visibilities 
is called “mapping”. 

The radio sky (apart from a few rare sources) does not vary⁸. This means that it is 
not necessary to measure all the Fourier components simultaneously. Thus for example 
one can imagine measuring all required Fourier components with just two antennas (one 
of which is mobile), by laboriously moving the second antenna from place to place. This 
method of gradually building up all the required Fourier components and using them 
to image the source is called “aperture synthesis”. If for example one has measured all 
Fourier components up to a baseline length of say 25 km, then one could obtain an 
image of the sky with the same resolution as that of a telescope of aperture size 25 km, 
i.e. one has synthesized a 25 km aperture. In practice one can use the fact that the Earth 
rotates to sample the U-V plane quite rapidly. As seen from a distant cosmic source, the 
baseline vector between two antennas on the Earth is continuously changing because 
of the Earth’s rotation (see Figure 2.3). Or equivalently, as the source rises and sets 
the Fourier components measured by a given pair of antennas are continuously changing. 
If one has an array of $N$ antennas spread on the Earth’s surface, then at any given 
instant one measures ${}^{N}C_{2} = N(N-1)/2$ Fourier components (or in radio astronomy jargon one has 
${}^{N}C_{2}$ samples in the U-V plane). As the Earth rotates one samples more and more of the 
U-V plane. For arrays like the GMRT with 30 antennas, if one tracks a source from rise 
to set, the sampling of the U-V plane is sufficiently dense to allow extremely high fidelity 
reconstructions of even complex sources. This technique of using the Earth’s rotation to 
improve “U-V coverage” was traditionally called “Earth rotation aperture synthesis”, but 
in modern usage is usually also simply referred to as “aperture synthesis”. 
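
Both the baseline count and the track swept out by Earth rotation are simple to compute. The sketch below counts the instantaneous baselines of an $N$ antenna array and traces the U-V track of an east-west baseline. The track formula, $u = (L/\lambda)\cos H$ and $v = (L/\lambda)\sin\delta\,\sin H$ for hour angle $H$ and declination $\delta$, is the standard result for a purely east-west baseline; it is not derived in these notes and is quoted here as an assumption. The baseline length, wavelength and declination are illustrative values.

```python
import numpy as np


def n_baselines(n_antennas):
    """Number of antenna pairs, N(N-1)/2, measured at any one instant."""
    return n_antennas * (n_antennas - 1) // 2


def east_west_track(length_m, wavelength_m, declination_rad, hour_angle_rad):
    """(u, v) in wavelengths for an east-west baseline (assumed standard formula)."""
    b = length_m / wavelength_m
    u = b * np.cos(hour_angle_rad)
    v = b * np.sin(declination_rad) * np.sin(hour_angle_rad)
    return u, v


print("baselines for 30 antennas:", n_baselines(30))

# Assumed example: 1 km east-west baseline, 1 m wavelength, source at +30 deg,
# tracked for 12 hours (hour angle from -6h to +6h).
dec = np.radians(30.0)
hour_angle = np.radians(np.linspace(-90.0, 90.0, 7))
u, v = east_west_track(1000.0, 1.0, dec, hour_angle)
for h, uu, vv in zip(np.degrees(hour_angle), u, v):
    print(f"H = {h:+6.1f} deg   u = {uu:8.1f}   v = {vv:8.1f}")
```

The points lie on an ellipse, $u^2 + (v/\sin\delta)^2 = (L/\lambda)^2$, which is the shape of the track shown in Figure 2.3.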

From the inverse relationship of Fourier conjugate variables it follows that short 
baselines are sensitive to large angular structures in the source and that long baselines are 
sensitive to fine scale structure. To image large, smooth sources one would hence like 
an array with the antennas closely packed together, while for a source with 
considerable fine scale structure one needs antennas spread out to large distances. The array 
configuration hence has a major influence on the kind of sources that can be imaged. 
The GMRT array configuration consists of a combination of a central 1 × 1 km cluster of 
densely packed antennas and three 14 km long arms along which the remaining 
antennas are spread out. This gives a combination of both short and long spacings, and gives 
considerable flexibility in the kind of sources that can be imaged. Arrays like the VLA on 
the other hand have all their antennas mounted on rails, allowing even more flexibility in 
determining how the U-V plane is sampled. 

⁸ Or, in the terminology of random processes, cosmic radio signals are stationary, i.e. their statistical 
properties like the mean, auto and cross-correlation functions etc. are independent of the absolute time. 

Figure 2.3: The track in the U-V plane traced out by an east-west baseline due to the 
Earth’s rotation. 


Other chapters in these notes discuss the practical details of aperture synthesis. 
Chapter 3 discusses how one can use radio antennas and receivers to measure the 
electric field from cosmic sources. For an $N$ antenna array one needs to measure ${}^{N}C_{2}$ 
correlations simultaneously; this is done by a (usually digital) machine called the 
“correlator”. The spatial correlation that one needs to measure (see equation 2.4.6) is the 
correlation between the instantaneous fields at points $P_1$ and $P_2$. In an interferometer 
the signals from antennas at points $P_1$ and $P_2$ are transported by cable to some central 
location where the correlator is. This means that the correlator also has to continuously 
adjust the delays of the signals from different antennas before correlating them. This 
and other corrections that need to be made are discussed in Chapter 4, and exactly how 
these corrections are implemented in the correlator is discussed in Chapters 8 and 9. 
The astronomical calibration of the measured visibilities is discussed in Chapter 5, while 
Chapter 16 deals with the various ways in which passage through the Earth’s ionosphere 
corrupts the astronomical signal. Chapters 10, 12 and 14 discuss the nitty gritty of 
going from the calibrated visibilities to the image of the sky. Chapters 13 and 15 discuss 
two refinements, viz. measuring the spectra and polarization properties of the sources 
respectively. 
2.5 The Fourier Transform 

The Fourier transform $U(\nu)$ of a function $u(t)$ is defined as 

$$U(\nu) = \int_{-\infty}^{\infty} u(t)\, e^{-i2\pi\nu t}\, dt$$

and can be shown to exist for any function $u(t)$ for which 

$$\int_{-\infty}^{\infty} |u(t)|\, dt < \infty$$

The Fourier transform is invertible, i.e. given $U(\nu)$, $u(t)$ can be obtained using the inverse 
Fourier transform, viz. 

$$u(t) = \int_{-\infty}^{\infty} U(\nu)\, e^{i2\pi\nu t}\, d\nu$$

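Some important properties of the Fourier transform are listed below (where by 
convention capitalized functions refer to the Fourier transform, and $\mathcal{F}\{\cdot\}$ denotes taking the 
Fourier transform) 

1. Linearity 

$$\mathcal{F}\{a\,u(t) + b\,v(t)\} = a\,U(\nu) + b\,V(\nu)$$

where $a, b$ are arbitrary complex constants. 

2. Similarity 

$$\mathcal{F}\{u(at)\} = \frac{1}{|a|}\, U\!\left(\frac{\nu}{a}\right)$$

where $a$ is an arbitrary non-zero real constant. 

3. Shift 

$$\mathcal{F}\{u(t-a)\} = e^{-i2\pi\nu a}\, U(\nu)$$

where $a$ is an arbitrary real constant. 

4. Parseval’s Theorem 

$$\int_{-\infty}^{\infty} |u(t)|^2\, dt = \int_{-\infty}^{\infty} |U(\nu)|^2\, d\nu$$

5. Convolution Theorem 

$$\mathcal{F}\left\{\int_{-\infty}^{\infty} u(t)\, v(\tau - t)\, dt\right\} = U(\nu)\, V(\nu)$$

6. Autocorrelation Theorem (for real $u(t)$) 

$$\mathcal{F}\left\{\int_{-\infty}^{\infty} u(t)\, u(t+\tau)\, dt\right\} = |U(\nu)|^2$$
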
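Some commonly used Fourier transform pairs are: 


Table 2.1: Fourier transform pairs 

    Function              Transform 
    --------------------  ------------------------------------------------------------------------
    $e^{-\pi t^2}$        $e^{-\pi \nu^2}$ 
    $1$                   $\delta(\nu)$ 
    $\cos(\pi t)$         $\frac{1}{2}\delta(\nu - \frac{1}{2}) + \frac{1}{2}\delta(\nu + \frac{1}{2})$ 
    $\sin(\pi t)$         $\frac{1}{2i}\delta(\nu - \frac{1}{2}) - \frac{1}{2i}\delta(\nu + \frac{1}{2})$ 
    rect$(t)$             sinc$(\nu)$ 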
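
Two of the pairs in Table 2.1 can be checked by brute-force numerical integration of the defining integral. The sketch below approximates the transform by a Riemann sum on a fine grid; the grid spacings and integration limits are choices made for this illustration, and the helper `fourier_transform` is hypothetical rather than a library routine. Here rect$(t)$ is taken to be 1 for $|t| < 1/2$ and 0 elsewhere, and sinc$(\nu) = \sin(\pi\nu)/(\pi\nu)$, which is also the convention used by `numpy.sinc`.

```python
import numpy as np


def fourier_transform(u, t, nu):
    """Approximate U(nu) = integral of u(t) exp(-i 2 pi nu t) dt by a Riemann sum."""
    dt = t[1] - t[0]
    kernel = np.exp(-2j * np.pi * np.outer(nu, t))
    return (kernel @ u) * dt


nu = np.array([0.0, 0.5, 1.0, 1.5])

# Gaussian pair: exp(-pi t^2) <-> exp(-pi nu^2); the tails beyond |t| = 8 are negligible.
t_gauss = np.arange(-8.0, 8.0, 1e-3)
gauss_ft = fourier_transform(np.exp(-np.pi * t_gauss**2), t_gauss, nu)

# rect pair: rect(t) <-> sinc(nu); sample rect at the midpoints of small cells.
n_cells = 10000
t_rect = -0.5 + (np.arange(n_cells) + 0.5) / n_cells
rect_ft = fourier_transform(np.ones(n_cells), t_rect, nu)

print("Gaussian  :", np.round(gauss_ft.real, 5), "expected", np.round(np.exp(-np.pi * nu**2), 5))
print("rect      :", np.round(rect_ft.real, 5), "expected", np.round(np.sinc(nu), 5))
```

The same helper can be used to check the shift and similarity properties listed above by transforming a shifted or stretched copy of the Gaussian.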



Chapter 3 

Single Dish Radio Telescopes 


Jayaram N. Chengalur 


3.1 Introduction 

As a preliminary to describing radio telescopes, it is useful to have a look at the 
transparency of the atmosphere to electro-magnetic waves of different frequencies. Figure 3.1 
is a plot of the height in the atmosphere at which the radiation is attenuated by a factor of 
2 as a function of frequency. There are only two bands at which radiation from outer space 
can reach the surface of the Earth, one from 3000 Å to 10000 Å (the optical/near-infrared 
window), and one from a few mm to tens of meters (the radio window). Radio waves longer 
than a few tens of meters get absorbed in the ionosphere, and those shorter than a few 
mm are strongly absorbed by water vapour. Since mm wave absorption is such a strong 
function of the amount of water vapour in the air, mm wave telescopes are usually located 
on high mountains in desert regions. 

The optical window extends about a factor of ~ 3 in wavelength, whereas the radio 
window extends almost a factor of ~ $10^4$ in wavelength. Hence while all optical telescopes 
‘look similar’, radio telescopes at long wavelengths have little resemblance to radio 
telescopes at short wavelengths. At long wavelengths, radio telescopes usually consist of 
arrays of resonant structures, like dipole or helical antennas (Figure 3.2). At short 
wavelengths reflecting telescopes (usually parabolic antennas, which focus incoming energy 
on to the focus, where it is absorbed by a small feed antenna) are used (Figure 3.3). 

Apart from this difference in morphology of antennas, the principal difference between 
radio and optical telescopes is the use of coherent (i.e. with the preservation of phase 
information) amplifiers in radio astronomy. The block diagram for a typical single dish 
radio astronomy telescope is shown in Figure 3.4. Radio waves from the distant cosmic 
source impinge on the antenna and create a fluctuating voltage at the antenna terminals. 
This voltage varies at the same frequency as the cosmic electro-magnetic wave, referred 
to as the Radio Frequency (RF). The voltage is first amplified by the front-end (or Radio 
Frequency) amplifier. The signal is weakest here, and hence it is very important that the 
amplifier introduce as little noise as possible. Front end amplifiers hence usually use 
low noise solid state devices, High Electron Mobility Transistors (HEMTs), often cooled to 
liquid helium temperatures. 

After amplification, the signal is passed into a mixer. A mixer is a device that changes 
the frequency of the input signal. Mixers have two inputs, one for the signal whose 
frequency is to be changed (the RF signal in this case), while the other input is usually a pure sine 
wave generated by a tunable signal generator, the Local Oscillator (LO). The output of the 
mixer is at the beat frequency of the radio frequency and the local oscillator frequency. 
So after mixing, the signal is now at a different (and usually lower) frequency than the RF; 
this frequency is called the Intermediate Frequency (IF). The main reason for mixing 
(also called heterodyning) is that though most radio telescopes operate at a wide range of 
radio frequencies, the processing required at each of these frequencies is identical. The 
economical solution is to convert each of these incoming radio frequencies to a standard 
IF and then to use the exact same back-end equipment for all possible RF frequencies 
that the telescope accepts. In telescopes that use co-axial cables to transport the signal 
across long distances, the IF frequency is also usually chosen so as to minimize 
transmission loss in the cable. Sometimes there is more than one mixer along the signal path, 
creating a series of IF frequencies, one of which is optimum for signal transport, another 
which is optimum for amplification etc. This is called a ‘super-heterodyne’ system. For 
example, the GMRT (see Chapter 21) accepts radio waves in six bands from 50 MHz to 
1.4 GHz and has IFs at 130 MHz, 175 MHz and 70 MHz¹. 

Figure 3.1: The height above the Earth’s surface where cosmic electro-magnetic radiation 
is attenuated by a factor of two, as a function of wavelength. There are two clear windows: the optical (V) (~ 4000 to 
10000 Å) and the radio (~ 1 mm to 10 m). In addition there are a few narrow windows in 
the infra-red (IR) wavelength range. At all other wavelengths astronomy is possible only 
through satellites. 
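
The frequency translation done by a mixer can be illustrated by modelling it as an ideal multiplier, which is an assumption made for this sketch (real mixers are non-linear devices followed by filters). Multiplying the RF tone by the LO tone produces components at the difference and the sum of the two frequencies; the difference (beat) frequency is the IF, and the sum is removed by filtering. The frequencies below are in arbitrary units and are illustrative only; they are not the actual GMRT local oscillator settings.

```python
import numpy as np


def mixer_output_frequencies(f_rf, f_lo, sample_rate, n_samples):
    """Return the two strongest frequencies in the product of an RF and an LO tone."""
    t = np.arange(n_samples) / sample_rate
    product = np.cos(2.0 * np.pi * f_rf * t) * np.cos(2.0 * np.pi * f_lo * t)
    spectrum = np.abs(np.fft.rfft(product))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    strongest = np.argsort(spectrum)[-2:]      # indices of the two largest peaks
    return np.sort(freqs[strongest])


# Assumed example: RF tone at 150 units, LO at 80 units, sampled at 1000 units.
difference, total = mixer_output_frequencies(150.0, 80.0, 1000.0, 1000)
print("difference (IF) component:", difference)
print("sum component            :", total)
```

Retuning the LO while keeping the difference frequency fixed is what allows the same IF and back-end equipment to serve every RF band.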

¹ There are IFs at 130 MHz and 175 MHz to allow the signals from the two different polarizations received 
by the antenna to be frequency division multiplexed onto the same optical fiber for transport to the central 
electronics building. 

Figure 3.2: The Mauritius Radio Telescope. This is a low frequency (150 MHz) array of 
which the individual elements are helical antennas. 


After conversion to IF, the signal is once again amplified (by the IF amplifier), and 
then mixed to a frequency range near 0 Hz (the Base Band (BB)) and then fed into the 
backend for further specialized processing. What backend is used depends on the nature 
of the observations. If what you want to measure is simply the total power that the 
telescope receives then the backend could be a simple square law detector followed by 
an integrator. (Remember the signal is a voltage that is proportional to the amplitude of the 
electric field of the incoming wave, and since the power in the wave goes like the square 
of its amplitude, the square of the voltage is a measure of the strength of the cosmic 
source). The integrator merely averages the signal to improve the signal to noise ratio. 
For spectral line observations the signal is passed into a spectrometer instead of a broad 
band detector. For pulsar observations the signal is passed into special purpose ‘pulsar 
machines’. Spectrometers (usually implemented as “correlators”) and pulsar machines 
are fairly complex and will not be discussed further here (see instead Chapters 8 and 17 
for more details). The rest of this chapter discusses only the first part of this block 
diagram, viz. the antenna itself. 


3.2 EM Wave Basics 

A cosmic source typically emits radio waves over a wide range of frequencies, but the 
radio telescope is sensitive to only a narrow band of emission centered on the RF. We 
can hence, to zeroth order, approximate this narrow band emission as a monochromatic 
wave. (More realistic approximations are discussed in Chapter 15). The waves leaving 
the cosmic source have spherical wavefronts which propagate away from the source at 
the speed of light. Since most sources of interest are very far away from the Earth, the 
radio telescope only sees a very small part of this spherical wave front, which can be well 
approximated by a plane wave front. Electro-magnetic waves are vector waves, i.e. the 
electric field has a direction as well as an amplitude. In free space, the electric field of 
a plane wave is constrained to be perpendicular to its direction of propagation and the 
power carried by the wave is proportional to the square of the amplitude of the electric 
field. 

Figure 3.3: The Caltech Sub-millimeter Observatory (CSO) at Mauna Kea in Hawaii. The 
telescope operates in the sub-mm wavelength range. 

Consider a plane EM wave of frequency v propagating along the Z axis (Figure 3.6). 
The electric field then can have only two components, one along the X axis, and one along 
the Y axis. Since the wave is varying with a frequency v, each of these components also 
varies with a frequency u, and at any one point in space the electric field vector will also 
vary with a frequency v. The polarization of the wave characterizes how the direction of 
the electric field vector varies at a given point in space as a function of time. 

The most general expression for each of the components of the electric field of a plane 
monochromatic wave 2 is: 


Ex = Ax cos(27 jvt + Sx) 

Ey = Ay COs(27Tl4 + 5y ) 

where $A_x$, $A_y$, $\delta_x$, $\delta_y$ are constants. If $A_y = 0$, then the field only has one component, 
along the X axis, which swings from $-A_x$ to $+A_x$ and back to $-A_x$ over 
one period. Such a wave is said to be linearly polarized along the X axis. Similarly, if $A_x$ 
is zero then the wave is linearly polarized along the Y axis. Waves which are generated by 
dipole antennas are linearly polarized along the length of the dipole. Now consider a wave 
for which $A_x = A_y$, $\delta_x = 0$, and $\delta_y = -\pi/2$. If we start at a time at which the X component 
is a maximum, then the Y component is zero and the total field points along the +X axis. 
A quarter period later, the X component will be zero and the Y component will be at 
maximum, so the total field points along the +Y direction. Another quarter of a period later, 
the Y component is again zero and the X component is at minimum, so the total field points 

2 Monochromatic waves are necessarily 100% polarized. As discussed in Chapter 15 quasi-monochromatic 
waves can be partially polarized. 

Figure 3.4: Block diagram of a single dish radio telescope. 



Figure 3.5: One of the 30 GMRT antennas 

along the -X direction. Thus over one period, the tip of the electric field vector describes a 
circle in the XY plane. Such a wave is called circularly polarized. If $\delta_y$ were $+\pi/2$ then 
the electric field vector would still describe a circle in the XY plane, but it would rotate 
in the opposite direction. The former is called Right Circular Polarization (RCP) and 
the latter Left Circular Polarization (LCP). 3 Waves generated by helical antennas are 
circularly polarized. In the general case when all the constants have arbitrary values, the 
tip of the electric field vector describes an ellipse in the XY plane, and the wave is said to be 
elliptically polarized. 
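
The rotation argument above can be checked numerically. The following is a minimal Python sketch, not part of the original lectures: the function names `field` and `rotation_sense` are illustrative, and the sign convention (+1 for rotation from +X towards +Y) is an assumption of the sketch rather than a definition of RCP or LCP.

```python
import math

def field(t, nu, ax, ay, dx, dy):
    """Return (Ex, Ey) at time t for the two-component plane wave."""
    ex = ax * math.cos(2 * math.pi * nu * t + dx)
    ey = ay * math.cos(2 * math.pi * nu * t + dy)
    return ex, ey

def rotation_sense(nu, ax, ay, dx, dy, steps=360):
    """Sense in which the tip of the field vector rotates over one period.

    Accumulates the z-component of E(t) x E(t + dt); returns +1 for
    rotation from +X towards +Y, -1 for the opposite sense, and 0 when
    the tip does not rotate (linear polarization).
    """
    dt = 1.0 / (nu * steps)
    total = 0.0
    for i in range(steps):
        ex0, ey0 = field(i * dt, nu, ax, ay, dx, dy)
        ex1, ey1 = field((i + 1) * dt, nu, ax, ay, dx, dy)
        total += ex0 * ey1 - ey0 * ex1
    if abs(total) < 1e-9:
        return 0
    return 1 if total > 0 else -1
```

With equal amplitudes, $\delta_x = 0$ and $\delta_y = -\pi/2$ (the case called RCP above) the sketch returns +1; with $\delta_y = +\pi/2$ it returns -1, and with $A_y = 0$ it returns 0.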

Any monochromatic wave can be decomposed into the sum of two orthogonal polarizations. 
What we did above was to decompose a circularly polarized wave into the sum 
of two linearly polarized components. One could also decompose a linearly polarized 
wave into the sum of LCP and RCP waves, with the same amplitude and $\pi$ radians out of 
phase. Any antenna is sensitive to only one polarization (for example a dipole antenna 
only absorbs waves with electric field along the axis of the dipole, while a helical antenna 
will accept only one sense of circular polarization). Note that the reflecting surface of a 
telescope could well 4 work for both polarizations, but the feed antenna will respond to 
only one polarization. To detect both polarizations one needs to put two feeds (which could 
possibly be combined into one mechanical structure) at the focus. Each feed will require 
its own set of electronics, such as amplifiers and mixers. 

EM waves are usually described by writing explicitly how the electric field strength 
varies in space and time. For example, a plane wave of frequency $\nu$ and wave number $k$ 
($k = 2\pi/\lambda$, $\lambda = c/\nu$) propagating along the Z axis and linearly polarized along the X axis 
could be described as 

$$E(z,t) = A \cos(2\pi\nu t - kz)$$

3 This RCP-LCP convention is unfortunately not fixed, and the reverse convention is also occasionally used, 
leading to endless confusion. It turns out however, that most cosmic sources have very little circular polarization. 

4 Not all reflecting radio telescopes have surfaces that reflect both polarizations. For example, the Ooty radio 
telescope’s (Figure 3.16) reflecting surface consists of a parallel set of thin stainless steel wires, which only 
reflect the polarization with the electric field parallel to the wires. 

This could also be written as 

$$E(z,t) = \mathrm{Real}\left(A\, e^{j(2\pi\nu t - kz)}\right)$$

where Real() implies taking the real part of () and $j$ is the imaginary square root of $-1$. 
Since all the time variation is at the same frequency $\nu$, one could suppress writing it out 
explicitly and introduce it only when one needs to deal with physical quantities. So, one 
could equally well describe the wave by the complex quantity $\mathcal{A}$, where $\mathcal{A} = A\, e^{-jkz}$, and 
understand that the physical field is obtained by multiplying $\mathcal{A}$ by $e^{j2\pi\nu t}$ and taking the 
real part of the product. The field $\mathcal{A}$ is called the phasor field 5. So for example the phasor 
field of the wave 

$$E = A \cos(2\pi\nu t - kz + \delta)$$

is simply $\mathcal{A}\, e^{j\delta}$. 
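
As a quick check of the phasor bookkeeping, here is a small Python sketch (an illustration only; the names `phasor` and `physical_field` are not from the text). It builds the phasor of a wave with phase offset $\delta$, restores the suppressed time factor, and takes the real part.

```python
import cmath
import math

def phasor(amplitude, k, z, delta=0.0):
    """Phasor of A cos(2 pi nu t - k z + delta), with the time factor dropped."""
    return amplitude * cmath.exp(1j * (delta - k * z))

def physical_field(ph, nu, t):
    """Multiply the phasor by exp(j 2 pi nu t) and take the real part."""
    return (ph * cmath.exp(2j * math.pi * nu * t)).real
```

For any choice of amplitude, wave number, position and time, `physical_field(phasor(A, k, z, delta), nu, t)` should agree with `A * math.cos(2 * math.pi * nu * t - k * z + delta)` to rounding error.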


3.3 Signals and Noise in Radio Astronomy 

3.3.1 Signals 

At radio frequencies, cosmic source strengths are usually measured in Janskys 6 (Jy). 
Consider a plane wave from a distant point source falling on the Earth. If the energy per 
unit time per unit frequency passing through an area of 1 square meter held perpendicular 
to the line of sight to the source is $10^{-26}$ W/Hz, then the source is said to have a flux 
density of 1 Jy, i.e. 

$$1\ \mathrm{Jy} = 10^{-26}\ \mathrm{W/m^2/Hz}$$

For an extended source, there is no longer a unique direction in which to hold the square 
meter; such sources are hence characterized by a sky brightness $B$, the energy flow at 
Earth per unit area, per unit time, per unit solid angle, per unit frequency, i.e. the units 
of brightness are W/m$^2$/Hz/sr. 

Very often the sky brightness is also measured in temperature units. To motivate 
these units, consider a black body at temperature $T$. The radiation from the black body 
is described by the Planck spectrum 

$$B(\nu) = \frac{2h\nu^3}{c^2}\,\frac{1}{e^{h\nu/kT} - 1}\quad \mathrm{W/m^2/Hz/sr}$$

i.e. the same units as the brightness. For a typical radio frequency of 1000 MHz, $h\nu/k = 
0.048$ K, so $h\nu/kT \ll 1$ for any realistic source temperature, hence 

$$e^{h\nu/kT} \simeq 1 + h\nu/kT$$

and 

$$B(\nu) \simeq \frac{2\nu^2}{c^2}\,kT = 2kT/\lambda^2$$

This approximation to the Planck spectrum is called the Rayleigh-Jeans approximation, 
and is valid through most of the radio regime. From the R-J approximation, 

$$T = \frac{\lambda^2}{2k}\,B(\nu)$$

5 For quasi-monochromatic waves (see Chapter 1), one has the related concept of the complex analytical signal. 

6 As befitting its relative youth, this is a linear, MKS based scale. At most other wavelengths, the brightness 
is traditionally measured in units far too idiosyncratic to be described in this footnote. 

In analogy, the brightness temperature $T_B$ of an extended source is defined as 

$$T_B = \frac{\lambda^2}{2k}\,B(\nu)$$

where $B(\nu)$ is the sky brightness of the source. Note that in general the brightness 
temperature $T_B$ has no relation to the physical temperature of the source. 
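
The quality of the Rayleigh-Jeans approximation is easy to check numerically. The sketch below is an illustration, not part of the original notes; the function names are our own and the constants are the SI defined values. It evaluates both spectra and inverts the R-J form to obtain a brightness temperature.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 299792458.0      # speed of light, m/s

def planck(nu, temp):
    """Planck brightness in W/m^2/Hz/sr at frequency nu (Hz), temperature temp (K)."""
    return (2 * H * nu**3 / C**2) / math.expm1(H * nu / (K_B * temp))

def rayleigh_jeans(nu, temp):
    """Rayleigh-Jeans brightness 2 k T / lambda^2 in W/m^2/Hz/sr."""
    return 2 * K_B * temp * nu**2 / C**2

def brightness_temperature(nu, brightness):
    """T_B = lambda^2 B / (2 k), valid whatever the emission mechanism."""
    lam = C / nu
    return lam**2 * brightness / (2 * K_B)
```

At 1000 MHz, `H * 1e9 / K_B` evaluates to about 0.048 K, and for a 100 K black body the two spectra agree to better than 0.1%.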

For certain sources, like the quiet sun and HII regions, the emission mechanism 
is thermal bremsstrahlung, and for these sources, provided the optical depth is large 
enough, the observed spectrum will be the Rayleigh-Jeans tail of the black body spectrum. 
In this case, the brightness temperature is directly related to the physical temperature 
of the electrons in the source. For sources in which the synchrotron emission 
mechanism dominates, the spectrum is not black-body, but is usually what is called 
steep spectrum 7, i.e. the flux increases sharply with increasing wavelength. At low 
frequencies, the most prominent such source is the Galactic non-thermal continuum, for 
which the flux $S \propto \nu^{-\alpha}$, $\alpha \sim 1$. At low frequencies hence, the sky brightness temperature 
dominates the system temperature 8. Pulsars and extended extra-galactic radio sources 
too in general have steep spectra and are brightest at low frequencies. At the extreme end 
of the brightness temperature scale are masers, where a lot of energy is pumped out in a narrow 
collimated molecular line; the brightness temperatures could reach $\sim 10^{12}$ K. This could 
certainly not be the physical temperature of the source, since the molecules disintegrate 
at temperatures well below $10^{12}$ K. 

3.3.2 Noise 

An antenna absorbs power from the radio waves that fall on it. This power is also usually 
specified in temperature units, i.e. degrees Kelvin. To motivate these units, consider 
a resistor placed in a thermal bath at a temperature T. The electrons in the resistor 
undergo random thermal motion, and this random motion causes a current to flow in the 
resistor. On the average there are as many electrons moving in one direction as in the 
opposite direction, and the average current is zero. The power in the resistor however 
depends on the square of the current and is not zero. From the equipartition principle 
one could compute this power as a function of the temperature, and in the radio regime 
the power per unit frequency is well approximated by the Nyquist formula: 

P = kT, 

where $k$ is the same Boltzmann constant as in the Planck law. In analogy with this, if a 
power $P$ (per unit frequency) is available at an antenna’s terminals the antenna is defined 
to have an antenna temperature of 

$$T_A = \frac{P}{k}$$

Note again that the antenna temperature does not correspond to the physical temperature 
of the antenna. Similarly the total power available at a radio telescope’s terminals, referred 
to the receiver (i.e. the RF amplifier) inputs 9, is defined as the system temperature $T_{sys}$, 
i.e. 

$$T_{sys} = \frac{\text{Total power referred to receiver inputs}}{k}$$
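
To get a feel for the numbers, the temperature units above can be converted back and forth with a couple of lines of Python. This is only a sketch; the function names are illustrative and the example values in the note below are hypothetical.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K

def antenna_temperature(power_per_hz):
    """T_A = P / k for an available power P per unit frequency (W/Hz)."""
    return power_per_hz / K_B

def noise_power(temp_kelvin, bandwidth_hz):
    """Total power k T dnu (W) collected over a bandwidth dnu at temperature T."""
    return K_B * temp_kelvin * bandwidth_hz
```

For instance, a 100 K system temperature observed over a 1 MHz band corresponds to a total power of only about 1.4e-15 W, which is one reason why several stages of amplification are needed.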

7 provided that the source is optically thin 

8 See the discussion on system temperature later in this section 

9 By ‘referred to the receiver inputs’ we mean the following. Suppose you have a noise power P at the output 
of the radio telescope. If there is only one stage of amplification with gain G, then the power referred to the 
inputs is P/G. If there is more than one stage of amplification, one has to rescale each noise source along the 
signal path by the gain of all the preceding amplifiers. 

Figure 3.7: The Arecibo telescope consists of a large (300 m) spherical reflector fitted into 
a naturally occurring valley. The telescope has feeds which are suspended from cables 
which originate from towers on the surrounding hills. Photo courtesy of NAIC, Arecibo 
observatory. 


The system temperature when looking at blank sky is a measure of the total random 
noise in the system and hence it is desirable to make the system temperature as low 
as possible. The noise contributions from the various sub systems that make up the radio 
telescope are uncorrelated and hence add up linearly. The system temperature can be very 
generally written as 

$$T_{sys} = T_{sky} + T_{spill} + T_{loss} + T_{rec}$$


$T_{sky}$ is the contribution of the background sky brightness. For example the galaxy is 
a strong emitter of non thermal 10 continuum radiation, which at low frequencies usually 
dominates the system temperature. At all frequencies the sky contributes at least 3K 
from the cosmic background radiation. 11 

The feed antenna is supposed to collect the radiation focused by the reflector. Often 
the feed antenna also picks up stray radiation from the ground (which radiates approximately 
like a black body at 300 K) around the edge of the reflector. This added noise 
is called spillover noise, and is a very important contribution to the system temperature 
at a telescope like Arecibo. Figure 3.8 shows (schematically) the system temperature 
for the (pre-upgrade) Arecibo telescope at 12cm as a function of the zenith angle 
at which the telescope is pointed. At high zenith angles the feed radiation spills over 
the edge of the dish and picks up a lot of radiation from the surrounding hills and the 

10 By non thermal radiation one means simply that the source function is not the Planck spectrum. 

11 Historically, this fact was discovered by Penzias and Wilson when they set out to perform the relatively 
mundane task of calibrating the system temperature of their radio telescope. This excess 3K discovered to 
come from the sky was identified with the radiation from the Big Bang, and was one of the powerful pieces 
of evidence in favour of the Big Bang model. The field of Radio Astronomy itself was started by Karl Jansky, 
who too was engaged in the task of calibrating the system temperature of his antenna (he had been set the 
task of characterizing the various kinds of noise which radio receivers picked up; this noise was harmful to 
trans-atlantic communication, and was hence essential to understand). Jansky discovered that one component 
of the ‘radio noise’ was associated with the Galactic center, the first detection of extra-terrestrial radio waves. 

Figure 3.8: Schematic of the variation of $T_{sys}$ with zenith angle for the pre-upgrade 
Arecibo. 


system temperature changes from under 40 K to over 80 K. If a reflecting screen were to 
be placed around the telescope edges, then the spillover radiation would be sky radiation 
reflected by the screen, and not thermal radiation from the ground. At cm wavelengths, 
$T_{sky} \ll T_{ground}$, so such a ground screen would significantly reduce the system temperature 
at high zenith angles 12. 

Any lossy element in the feed path will also contribute noise ($T_{loss}$) to the system. This 
follows from Kirchhoff’s law, which states that good absorbers are also good emitters, and 
that the ratio of emission to absorption in thermodynamic equilibrium is given by the 
Planck spectrum at the absorber’s physical temperature. This is the reason why there 
are rarely any uncooled elements between the feed and the first amplifier. Finally, the 
receiver also adds noise to the system, which is characterized by $T_{rec}$. The noise added 
after the first few stages of amplification is usually an insignificant fraction of the signal 
strength and can often be ignored. 
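
The claim that later stages contribute little follows from referring each noise source to the receiver inputs, i.e. dividing it by the gain of all the preceding amplifiers. A minimal Python sketch of this bookkeeping is given below; the function name and the stage values used in the note are hypothetical, chosen only for illustration.

```python
def input_referred_temperature(stage_noise, stage_gain):
    """Sum noise temperatures (K) referred to the input of the first stage.

    stage_noise[i] is the noise temperature added at the input of stage i;
    stage_gain[i] is the linear power gain of stage i.
    """
    total = 0.0
    gain_so_far = 1.0
    for noise, gain in zip(stage_noise, stage_gain):
        total += noise / gain_so_far   # rescale by the gain of preceding stages
        gain_so_far *= gain
    return total
```

For example, a first amplifier adding 30 K with a power gain of 1000, followed by a second stage adding 300 K, gives 30.3 K referred to the input: the noisy second stage raises the total by only 1%.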

The final, increasingly important contributor to the system temperature is terrestrial 
interference. If the bandwidth of the interference is large compared to the spectral resolution, 
the interference is called broad band. Steady, broad band interference increases the 
system temperature, and provided this increase is small its effects are relatively benign. 
However, interference typically varies on a very rapid time scale, causing a rapid fluctuation 
in the system temperature. This is considerably more harmful, since such fluctuations 
could have harmonics which are mistaken for pulsars etc. In aperture synthesis 
telescopes such time varying effects will also produce artifacts in the resulting image 13. 
Interference whose bandwidth is small compared to the spectral resolution is called narrow 
band interference. Such interference, provided it is weak enough, will corrupt only 
one spectral channel in the receiver. Provided this spectral channel is not important (i.e. 
does not coincide with, for example, a spectral line from the source) it can be flagged with little 

12 As can be seen from Figure 3.7, such a screen has indeed been built, and it has dramatically reduced the 
system temperature at high zenith angles. The wire mesh for this screen was produced, with the co-ordination 
of NCRA by the same contractor who fabricated the mesh for the GMRT antennas, and was exported to the 
USA. 

13 It is often claimed that interferometers are immune from interference because different antennas “see” 
different interfering sources and these do not correlate with one another. However since the interference is 
typically varying on timescales faster than the system temperature is calibrated, the resulting variations in 
the system temperatures of the different antennas cause variations in the observed correlation coefficient (for 
telescopes which do a continuous normalization by the auto-correlation of each antenna’s signal) and hence 
artifacts in the image plane. 






loss of information. However, if the interference is strong enough, the receiver saturates, 
which has several deleterious effects. Firstly, since the receiver is no longer in its linear 
range, the increase in antenna temperature on looking at a cosmic source is no longer 
simply related to the source brightness, making it difficult, and usually impossible, to 
derive the actual source brightness. This is called compression. Further, if some other 
spectral feature is present, perhaps even a spectral line from the source, spurious signals 
are produced at the beat frequencies of the true spectral line and the interference. These 
are called intermodulation products. Given the increasingly hostile interference environment 
at low frequencies, it is important to have receivers with large dynamic range, 
i.e. whose region of linear response is as large as possible. It can often be worth accepting 
an increase in the receiver temperature in exchange for a gain in dynamic range. 
For particularly strong and steady sources of interference (such as carriers for nearby TV 
stations), it is usually the practice to block such signals out using narrow band filters 
before the first amplifier 14. 

3.3.3 Signal to Noise Ratio 

Since the signals 15 in a radio telescope are random in nature, the output of a total power 
detector attached to a radio telescope too will show random fluctuations. Suppose a 
telescope with system temperature $T_{sys}$, gain $G$, and bandwidth $\Delta\nu$ is used to try and 
detect some astrophysical source. The strategy one could follow is to first look at a 
‘blank’ part of the sky, and then switch to a region containing the source. Clearly if the 
received power increases, then one has detected radio waves from this source 16. But 
given that the output even on a blank region of sky is fluctuating, how can one be sure 
that the increase in antenna temperature is not a random fluctuation but is indeed due 
to the astrophysical source? In order to make this decision, one needs to know what 
the rms is in the fluctuations. It will be shown later 17 that for a total power detector 
with instantaneous rms $T_{sys}$, the rms after integrating a signal of bandwidth $\Delta\nu$ Hz for 
$\tau$ seconds is 18 $T_{sys}/\sqrt{\Delta\nu\,\tau}$. The increase in system temperature is just $GS$, where $S$ is the 
flux density of the source (so $G$ here is measured in K/Jy). The signal to noise ratio is hence 

$$\mathrm{snr} = \frac{G S \sqrt{\Delta\nu\,\tau}}{T_{sys}}$$

This is the fundamental equation for the sensitivity of a single dish telescope. Provided 
the signal to noise ratio is sufficiently large, one can be confident of having detected the 
source. 
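
The sensitivity equation is simple enough to turn into a small calculator. The Python sketch below is an illustration only: the function names are our own, the gain is assumed to be expressed in K/Jy, and the numbers quoted in the note are hypothetical values, not the parameters of any particular telescope.

```python
import math

def snr(gain_k_per_jy, flux_jy, t_sys, bandwidth_hz, tau_s):
    """Signal to noise ratio G S sqrt(dnu tau) / T_sys of a total power detector."""
    return gain_k_per_jy * flux_jy * math.sqrt(bandwidth_hz * tau_s) / t_sys

def integration_time(target_snr, gain_k_per_jy, flux_jy, t_sys, bandwidth_hz):
    """Integration time (s) needed to reach target_snr, from inverting snr()."""
    return (target_snr * t_sys / (gain_k_per_jy * flux_jy)) ** 2 / bandwidth_hz
```

With an assumed gain of 0.3 K/Jy, a system temperature of 100 K and a bandwidth of 16 MHz, a 10 mJy source gives a signal to noise ratio of 0.12 in one second, and `integration_time(5, 0.3, 0.01, 100, 16e6)` shows that roughly half an hour is needed to reach a ratio of 5. Note the square-root dependence: halving the noise costs four times the integration time.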

The signal to noise ratio here considers only the ‘thermal noise’, i.e. the noise from the 
receivers, spillover, sky temperature etc. In addition there will be random fluctuations 
from position to position as discussed below because of confusion. For most single dish 
radio telescopes, especially at low frequencies, the thermal noise reaches the confusion 
limit (see Section 3.4) in fairly short integration times. To detect even fainter sources, 
it becomes necessary then to go for higher resolution, usually attainable only through 
interferometry. 

14 Recall from the discussion above on the effect of introducing lossy elements in the signal path that the price 
one pays is precisely an increase in receiver temperature 

15 Apart from interference etc. 

16 Assuming of course that you have enough spatial resolution to make this identification 

17 Chapter 5 

18 This can be heuristically understood as follows. For a stochastic process of bandwidth $\Delta\nu$, the coherence 
time is $\sim 1/\Delta\nu$, which means that in a time of $\tau$ seconds, one has $\Delta\nu\,\tau$ independent samples. The rms decreases 
as the square root of the number of independent samples. 

3.4 Antenna Patterns 

The most important characteristic of an antenna is its ability to absorb radio waves incident 
upon it. This is usually described in terms of its effective aperture. The effective 
aperture of an antenna is defined as 

$$A_e = \frac{\text{Power density available at the antenna terminals}}{\text{Flux density of the wave incident on the antenna}}$$

The units are 

$$\frac{\mathrm{W/Hz}}{\mathrm{W/m^2/Hz}} = \mathrm{m^2}$$

The effective area is a function of the direction of the incident wave, because the 
antenna works better in some directions than in others. Hence 

$$A_e = A_e(\theta,\phi)$$

This directional property of the antenna is often described in the form of a power pattern. 
The power pattern is simply the effective area normalized to be unity at the maximum, 
i.e. 

$$P(\theta,\phi) = \frac{A_e(\theta,\phi)}{A_e^{max}}$$


The other common way to specify the directive property of an antenna is the field pattern. 
Consider an antenna receiving radio waves from a distant point source. The voltage at the 
terminals of the antenna as a function of the direction to the point source, normalized 
to unity at maximum, is called the field pattern $f(\theta,\phi)$ of the antenna. The pattern 
of an antenna is the same regardless of whether it is used as a transmitting antenna 
or as a receiving antenna, i.e. if it transmits efficiently in some direction, it will also 
receive efficiently in that direction. This is called Reciprocity (or occasionally Lorentz 
Reciprocity) and follows from Maxwell’s equations. From reciprocity it follows that the 
electric field far from a transmitting antenna, normalized to unity at maximum, is simply 
the field pattern $f(\theta,\phi)$. Since the power flow is proportional to the square of the electric 
field, the power pattern is the square of the field pattern. The power pattern is hence real 
and positive semi-definite. 

A typical power pattern is shown in Figure 3.9. The power pattern has a primary maximum, 
called the main lobe, and several subsidiary maxima, called side lobes. The points 
at which the main lobe falls to half its central value are called the Half Power points and 
the angular distance between these points is called the Half Power Beamwidth (HPBW). 
The minima of the power pattern are called nulls. For radio astronomical applications 
one generally wants the HPBW to be small (so that nearby sources are not confused 
with one another), and the sidelobes to be low (to minimize stray radiation). From simple 
diffraction theory it can be shown that the HPBW of a reflecting telescope is given by 

$$\Theta_{HPBW} \sim \lambda/D$$

where $D$ is the physical dimension of the telescope. $\lambda$ and $D$ must be measured in the 
same units and $\Theta$ is in radians. So the larger the telescope, the better the resolution. 
For example, the HPBW of a 700 foot telescope at 2380 MHz is about 2 arcmin. This is 
very poor resolution; an optical telescope ($\lambda \sim 5000$ Å) a few inches in diameter has a 
resolution of a few arc seconds. However, the resolution of single dish radio telescopes, 
unlike optical telescopes, is not limited by atmospheric turbulence. Figure 3.10 shows 
the power pattern of the (pre-upgrade) Arecibo telescope at 2380 MHz. Although the 
telescope is 1000 ft in diameter, only a 700 ft diameter aperture is used at any given 
time, and the HPBW is about 2 arc min. There are two sidelobe rings, which are not quite 
azimuthally symmetric. 
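
The numbers quoted above are easy to reproduce. The short Python sketch below (illustrative only; the function name and the 4 inch optical aperture are our own choices) evaluates the order-of-magnitude estimate $\lambda/D$, ignoring the factor of order unity that depends on the aperture illumination.

```python
import math

C = 299792458.0   # speed of light, m/s
FOOT = 0.3048     # metres per foot
INCH = 0.0254     # metres per inch

def hpbw_arcmin(diameter_m, wavelength_m):
    """Order-of-magnitude beamwidth lambda / D, converted from radians to arcmin."""
    return math.degrees(wavelength_m / diameter_m) * 60.0

radio = hpbw_arcmin(700 * FOOT, C / 2380e6)          # 700 ft aperture at 2380 MHz
optical = hpbw_arcmin(4 * INCH, 5000e-10) * 60.0     # 4 inch telescope at 5000 Angstrom, in arcsec
```

Here `radio` comes out at about 2.0 arcmin, in agreement with the figure quoted for the 700 ft aperture, while `optical` is about 1 arcsec.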

There are two other patterns which are sometimes used to describe antennas. The 
first is the directivity $D(\theta,\phi)$. The directivity is defined as: 

$$D(\theta,\phi) = \frac{\text{Power emitted into } (\theta,\phi)}{(\text{Total power emitted})/4\pi} \qquad (3.4.1)$$

$$\phantom{D(\theta,\phi)} = \frac{4\pi P(\theta,\phi)}{\int P(\theta,\phi)\,d\Omega} \qquad (3.4.2)$$


This is the ‘transmitting’ pattern of the antenna, and from reciprocity should be the 
same as the receiving power pattern to within a constant factor. We will shortly work out 
the value of this constant. The other pattern is the gain $G(\theta,\phi)$. The gain is defined as: 

$$G(\theta,\phi) = \frac{\text{Power emitted into } (\theta,\phi)}{(\text{Total power input})/4\pi} \qquad (3.4.4)$$


The gain is the same as the directivity, except for an efficiency factor. Finally a figure 
of merit for reflector antennas is the aperture efficiency, $\eta$. The aperture efficiency is 
defined as: 

$$\eta = \frac{A_e^{max}}{A_g} \qquad (3.4.5)$$

where $A_g$ is the geometric cross-sectional area of the main reflector. As we shall prove 
below, the aperture efficiency is at most 1.0. 

Figure 3.11: The antenna temperature is the convolution of the sky brightness and the 
telescope beam. 

Consider observing a sky brightness distribution $B(\theta)$ with a telescope with a power 
pattern like that shown in Figure 3.9. The power available at the antenna terminals is 
the integral of the brightness in a given direction times the effective area in that direction 
(Figure 3.11). 

$$W(\theta') = \frac{1}{2} \int B(\theta)\, A_e(\theta - \theta')\, d\theta \qquad (3.4.6)$$

where the available power $W$ is a function of $\theta'$, the direction in which the telescope is 
pointed. The factor of $\frac{1}{2}$ is to account for the fact that only one polarization is absorbed 
by the antenna. In two dimensions, the expression for $W$ is: 

$$W(\theta',\phi') = \frac{1}{2} \int B(\theta,\phi)\, A_e(\theta - \theta', \phi - \phi')\, \sin(\theta)\, d\theta\, d\phi \qquad (3.4.7)$$

In temperature units, this becomes: 

$$T_A(\theta',\phi') = \frac{1}{\lambda^2} \int T_B(\theta,\phi)\, A_e(\theta - \theta', \phi - \phi')\, \sin(\theta)\, d\theta\, d\phi \qquad (3.4.8)$$

or 

$$T_A(\theta',\phi') = \frac{A_e^{max}}{\lambda^2} \int T_B(\theta,\phi)\, P(\theta - \theta', \phi - \phi')\, \sin(\theta)\, d\theta\, d\phi \qquad (3.4.9)$$

So the antenna temperature is a weighted average of the sky temperature, the weighting 
function being the power pattern of the antenna. Only if the power pattern is a single 
infinitely sharp spike is the antenna temperature the same as the sky temperature. For 
all real telescopes, however, the antenna temperature is a smoothed version of the sky 
temperature. Suppose that you are making a sky survey for sources. Then a large 
increase in the antenna temperature could mean either that there is a source in the main 
beam, or that a collection of faint sources have combined to give a large total power. From 
the statistics of the distribution of sources in the sky (presuming you know it) and the 
power pattern, one could compute the probability of the latter event. This gives a lower 
limit to the weakest detectable source; below this limit (called the confusion limit), one 
can no longer be confident that increases in the antenna temperature correspond to a 
single source in the main beam. The confusion limit is an important parameter of any 
given telescope; it is a function of the frequency and the assumed distribution of sources. 
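
The smoothing effect of the beam can be illustrated with a one dimensional toy model. In the Python sketch below the Gaussian beam shape, the function names and the sampled sky are all assumptions made for illustration; the weighted average is normalized by the sum of the beam weights, which is the discrete analogue of dividing by the integral of the power pattern.

```python
import math

def gaussian_beam(offset, hpbw):
    """Assumed power pattern: a Gaussian, unity at the centre, 0.5 at hpbw / 2."""
    return math.exp(-4.0 * math.log(2.0) * (offset / hpbw) ** 2)

def antenna_temperature_scan(sky, step, hpbw):
    """Beam-weighted average of a sampled 1D sky temperature at each pointing."""
    n = len(sky)
    scan = []
    for i in range(n):
        weights = [gaussian_beam((j - i) * step, hpbw) for j in range(n)]
        weighted = sum(w * s for w, s in zip(weights, sky))
        scan.append(weighted / sum(weights))
    return scan
```

A uniform sky is returned unchanged, whereas a single bright sample is spread into a copy of the beam whose peak is much lower than the true sky temperature at that point. Two faint sources closer together than the beamwidth would similarly merge into a single bump, which is the origin of the confusion limit.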

Now consider an antenna terminated in a resistor, with the entire system being placed 
in a black box at temperature $T$. After thermal equilibrium has been reached, the power 
flowing from the resistor to the antenna is: 

$$P_{R \to A} = kT$$

The power flow from the antenna to the resistor is (from equation (3.4.9) and using the 
fact that the sky temperature is the same everywhere) 

$$P_{A \to R} = \left(\frac{A_e^{max}\, kT}{\lambda^2}\right) \int P(\theta,\phi)\, d\Omega$$

In thermal equilibrium the net power flow has to be zero, hence 

$$A_e^{max} = \frac{\lambda^2}{\int P(\theta,\phi)\, d\Omega} \qquad (3.4.10)$$


i.e. the maximum effective aperture is determined by the shape of the power pattern 
alone. The narrower the power pattern the higher the aperture efficiency. For a reflecting 
telescope, 

$$\int P(\theta,\phi)\, d\Omega \sim \Theta_{HPBW}^2 \sim \left(\frac{\lambda}{D}\right)^2$$

so 

$$A_e^{max} \sim D^2$$

The max. effective aperture scales like the geometric area of the reflector, as expected. 
Also from equation (3.4.10) 

$$A_e(\theta,\phi) = A_e^{max}\, P(\theta,\phi) = \frac{\lambda^2\, P(\theta,\phi)}{\int P(\theta,\phi)\, d\Omega} \qquad (3.4.11)$$

Comparing this with equation (3.4.1) gives the constant that relates the effective area to 
the directivity 

$$D(\theta,\phi) = \frac{4\pi}{\lambda^2}\, A_e(\theta,\phi) \qquad (3.4.12)$$

As an application for all these formulae, consider the standard communications problem 
of sending information from antenna 1 (gain $G_1(\theta,\phi)$, input power $P_1$) to antenna 2 
(directivity $D_2(\theta',\phi')$), at distance $R$ away. What is the power available at the terminals of 
antenna 2? 

The flux density at antenna 2 is given by: 

$$S = \frac{P_1}{4\pi R^2}\, G_1(\theta,\phi)$$

i.e., the power falls off like $1/R^2$, but is not isotropically distributed. (The gain $G_1$ tells you 
how collimated the emission from antenna 1 is). The power available at the terminals of 
antenna 2 is: 

$$W = A_{2e}\, S = \frac{P_1}{4\pi R^2}\, G_1(\theta,\phi)\, A_{2e}$$

substituting for the effective aperture from equation (3.4.12) 

$$W = \left(\frac{\lambda}{4\pi R}\right)^2 P_1\, G_1(\theta,\phi)\, D_2(\theta',\phi')$$

Figure 3.12: Aperture illumination for a parabolic dish. 


This is called the Friis transmission equation. In Radar astronomy, there is a very 
similar expression for the power available at an antenna after bouncing off an unresolved 
target (the radar range equation). The major difference is that the signal has to make 
a round trip, (and the target reradiates power falling on it isotropically), so the received 
power falls like the fourth power of the distance to the target. 
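
As a cross check on the algebra leading to the Friis transmission equation, the Python sketch below computes the received power in two ways: directly from the final formula, and step by step via the flux density and the effective aperture of equation (3.4.12). The function names and the link parameters used in the note are hypothetical.

```python
import math

def effective_aperture(directivity, wavelength):
    """A_e = D lambda^2 / (4 pi), i.e. equation (3.4.12) solved for A_e."""
    return directivity * wavelength**2 / (4 * math.pi)

def flux_density(p1, g1, distance):
    """Flux density P1 G1 / (4 pi R^2) at a distance R from antenna 1."""
    return p1 * g1 / (4 * math.pi * distance**2)

def friis_received_power(p1, g1, d2, wavelength, distance):
    """Power available at antenna 2: (lambda / 4 pi R)^2 P1 G1 D2."""
    return (wavelength / (4 * math.pi * distance)) ** 2 * p1 * g1 * d2
```

For any set of inputs, `flux_density(p1, g1, R) * effective_aperture(d2, lam)` should equal `friis_received_power(p1, g1, d2, lam, R)`, and doubling `R` reduces the received power by a factor of four.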


3.5 Computing Antenna Patterns 

The next step is to understand how to compute the power pattern of a given telescope. 
Consider a parabolic reflecting telescope being fed by a feed at the focus. The radiation 
from the feed reflects off the telescope and is beamed off into space (Figure 3.12). If 
one knew the radiation pattern of the feed, then from geometric optics (i.e. simple ray 
tracing, see Chapter 19) one could calculate the electric field on the plane across the 
mouth of the telescope (the ‘aperture plane’). What does the field very far away from the 
telescope look like? If the telescope surface were infinitely large, then the electric field 
in the aperture plane would simply be a plane wave, and since a plane wave remains a plane 
wave on propagation through free space, the far field would simply be a plane wave traveling 
along the axis of the reflector. The power pattern would be an infinitely narrow spike, zero 
everywhere except along the axis. Real telescopes are however finite in size, and this 
results in diffraction. The rigorous solution to the diffraction problem is to find the 
appropriate Green’s function for the geometry; this is often impossible in practice and 
various approximations are necessary. The most commonly used one is Kirchhoff’s scalar 
diffraction theory. However, for our purposes, it is more than sufficient to simply use 
Huygens’ principle. 

Huygens’ principle states that each point in a wave front can be regarded as an imaginary 
source. The wave at any other point can then be computed by adding together the 
contributions from each of these point sources. For example consider a one dimensional 
aperture of length $l$ with the electric field distribution (‘aperture illumination’) $e(x)$. The 
field at a point $P(R,\theta)$ (Figure 3.13) due to a point source at a distance $x$ from the center 
of the aperture (if $R$ is much greater than $l$) is: 

$$dE = \frac{e(x)}{R^2}\, e^{-j\frac{2\pi x \sin\theta}{\lambda}}\, dx$$

Figure 3.13: The far-field pattern as a function of the aperture illumination. 

Here $x \sin\theta$ is simply the difference in path length between the path from the center 
of the aperture to the point P and the path from point $x$ to point P. Since the wave from 
point $x$ has a shorter path length, it arrives at point P at an earlier phase. The total 
electric field at P is: 

$$E(R,\theta) = \int_{-l/2}^{l/2} \frac{e(x)}{R^2}\, e^{-jk\mu x}\, dx$$

where $k = 2\pi$, $\mu = \sin\theta$, and $x$ is now measured in units of wavelength. The shape of 
the distribution is clearly independent of $R$, and hence the unnormalized field pattern 
$F_U(\mu)$ is just: 

$$F_U(\mu) = \int_{-\infty}^{\infty} e_1(x)\, e^{-jk\mu x}\, dx \qquad (3.5.13)$$

where 

$$e_1(x) = \begin{cases} e(x) & \text{if } |x| < l/2 \\ 0 & \text{otherwise} \end{cases}$$

The region in which the field pattern is no longer dependent on the distance from the 
antenna is called the far field region. The integral operation in equation (3.5.13) is called 
the Fourier transform. $F_U(\mu)$ is the Fourier transform of $e_1(x)$, which is often denoted as 
$F_U(\mu) = \mathcal{F}[e_1(x)]$. The Fourier transform has many interesting properties, some of which 
are listed below (see also Section 2.5). 

1. Linearity 

If $G_1(\mu) = \mathcal{F}[g_1(x)]$ and $G_2(\mu) = \mathcal{F}[g_2(x)]$ then $G_1(\mu) + G_2(\mu) = \mathcal{F}[g_1(x) + g_2(x)]$. 


2. Inverse 

The Fourier transform is an invertible operation; if 

$$G(\mu) = \int_{-\infty}^{\infty} g(x)\, e^{-j2\pi\mu x}\, dx$$

then 

$$g(x) = \int_{-\infty}^{\infty} G(\mu)\, e^{j2\pi\mu x}\, d\mu$$


3. Phase shift 

If $G(\mu) = \mathcal{F}[g(x)]$ then $G(\mu - \mu_0) = \mathcal{F}[g(x)\, e^{j2\pi\mu_0 x}]$. This means that an antenna beam 
can be steered across the sky simply by introducing the appropriate linear phase 
gradient in the aperture illumination. 

4. Parseval’s theorem 

If $G(\mu) = \mathcal{F}[g(x)]$, then 

$$\int_{-\infty}^{\infty} |G(\mu)|^2\, d\mu = \int_{-\infty}^{\infty} |g(x)|^2\, dx$$

This is merely a restatement of power conservation. The LHS is the power outflow 
from the antenna as measured in the far field region, the RHS is the power outflow 
from the antenna as measured at the aperture plane. 


5. Area 

If $G(\mu) = \mathcal{F}[g(x)]$, then 

$$G(0) = \int_{-\infty}^{\infty} g(x)\, dx$$
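
The area and phase shift properties can be verified numerically for a finite aperture. The Python sketch below is only an illustration under stated assumptions: it approximates the transform by a midpoint-rule sum, takes $x$ in units of wavelength, and uses the sign convention $e^{-j2\pi\mu x}$ of the transform pair above; the names `far_field`, `uniform` and `steered` are our own.

```python
import cmath
import math

def far_field(illumination, length, mu, samples=400):
    """Midpoint-rule estimate of the integral of g(x) exp(-j 2 pi mu x) dx
    over |x| < length / 2, with x measured in wavelengths."""
    dx = length / samples
    total = 0j
    for i in range(samples):
        x = -length / 2 + (i + 0.5) * dx
        total += illumination(x) * cmath.exp(-2j * math.pi * mu * x) * dx
    return total

def uniform(x):
    """Uniform aperture illumination of unit amplitude."""
    return 1.0

def steered(mu0):
    """Uniform illumination with a linear phase gradient exp(+j 2 pi mu0 x)."""
    return lambda x: cmath.exp(2j * math.pi * mu0 * x)
```

For an aperture 10 wavelengths long, `far_field(uniform, 10, 0)` returns the aperture area (10), while `far_field(steered(0.2), 10, mu)` equals `far_field(uniform, 10, mu - 0.2)`: the whole pattern, and in particular its peak, has moved from $\mu = 0$ to $\mu = 0.2$.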


With this background we are now in a position to determine the maximum effective 
aperture of a reflecting telescope. For a 2D aperture with aperture illumination $g(x,y)$, 
from equation (3.4.10) 

$$A_e^{max} = \frac{\lambda^2}{\int P(\theta,\phi)\, d\Omega} = \frac{\lambda^2}{\int |F(\theta,\phi)|^2\, d\Omega} \qquad (3.5.14)$$

But the field pattern is just the normalized far field electric field strength, i.e. 

$$F(\theta,\phi) = \frac{E(\theta,\phi)}{E(0,0)}$$

where $E(\theta,\phi) = \mathcal{F}[g(x,y)]$. From property (5) 

$$E(0,0) = \int g(x,y)\, dx\, dy \qquad (3.5.15)$$

and from Parseval’s theorem, 

$$\int |E(\theta,\phi)|^2\, d\Omega = \int |g(x,y)|^2\, dx\, dy \qquad (3.5.16)$$

Substituting in equation (3.5.14) using equations (3.5.15) and (3.5.16) gives 

$$A_e^{max} = \lambda^2\, \frac{\left|\int g(x,y)\, dx\, dy\right|^2}{\int |g(x,y)|^2\, dx\, dy}$$

For uniform illumination 

$$\frac{A_e^{max}}{\lambda^2} = A_g$$

Note that since $x$ and $y$ are in units of wavelength, so is $A_g$. $A_e^{max}$ however is in 
physical units. Uniform illumination gives the maximum possible aperture efficiency (i.e. 
1), because if the illumination is tapered then the entire available aperture is not being 
used. 

As a concrete example, consider a 1D uniformly illuminated aperture of length $l$. The
far field is then

$$E(\mu) = \int_{-l/2}^{l/2} e^{-j2\pi x\mu/\lambda}\, dx = \frac{\lambda \sin(\pi l\mu/\lambda)}{\pi\mu}$$

and the normalized field pattern is

$$F(\mu) = \frac{\sin(\pi l\mu/\lambda)}{\pi l\mu/\lambda}$$

This is called a 1D sinc function. The 1st null is at $\mu = \lambda/l$, the 1st sidelobe is at approximately
$\mu = \frac{3}{2}(\lambda/l)$ and is of strength approximately $2/(3\pi)$. The strength of the power pattern 1st sidelobe is
hence about $(2/3\pi)^2 = 4.5\%$. This illustrates two very general properties of Fourier transforms:

1. the width of a function is inversely proportional to the width of its transform (so large
antennas will have small beams and small antennas will have large beams), and

2. any sharp discontinuities in the function will give rise to sidelobes (‘ringing’) in the
Fourier transform.

Figure 3.14 shows a plot of the power and field patterns for a 700 ft, uniformly
illuminated aperture at 2380 MHz.
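The numbers quoted for the sinc pattern can be reproduced with a few lines of code. The following is a minimal sketch (not from the original notes) for the 700 ft aperture at 2380 MHz; the scan step used to locate the sidelobe is an arbitrary choice, and `u` denotes $l\mu/\lambda$.

```python
import math

C = 299792458.0           # speed of light (m/s)
FT = 0.3048               # metres per foot

def sinc_pattern(u):
    """Normalized field pattern sin(pi u)/(pi u), with u = l*mu/lambda."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

wavelength = C / 2380e6                       # metres
length = 700 * FT                             # metres
first_null_rad = wavelength / length          # first null at mu = lambda / l
first_null_arcmin = math.degrees(first_null_rad) * 60

# Locate the first sidelobe by a brute-force scan between the first two nulls (1 < u < 2).
us = [1 + k * 1e-4 for k in range(1, 10000)]
u_peak = max(us, key=lambda u: abs(sinc_pattern(u)))
field_sidelobe = abs(sinc_pattern(u_peak))
power_sidelobe = field_sidelobe ** 2

print(first_null_arcmin, u_peak, field_sidelobe, power_sidelobe)
```

The scan places the sidelobe slightly inside $u = 3/2$ and gives a power level a little above the 4.5% estimate, which is why the values in the text are best read as convenient approximations.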

Aperture illumination design hence involves the following tradeoffs (see also
Chapter 19):

1. A more tapered illumination will have a broader main beam (or equivalently smaller 
effective aperture) but also lower side lobes than uniform illumination. 

2. If the illumination is high towards the edges, then unless there is a very rapid cutoff 
(which is very difficult to design, and which entails high sidelobes) there will be a lot 
of spillover. 
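The first tradeoff can be made quantitative with a small numerical experiment. The sketch below compares uniform illumination with a cosine taper on a 1D aperture; the cosine taper is a hypothetical example chosen for illustration (it is not taken from the notes), and the efficiency is computed as $|\int g\, dx|^2 / (L \int |g|^2\, dx)$, the 1D analogue of the expression for the maximum effective aperture.

```python
import math

def pattern(g, xs, u, L):
    """|integral of g(x) exp(-j 2 pi u x / L) dx| by the midpoint rule.
    g is real and even here, so only the cosine part contributes."""
    dx = xs[1] - xs[0]
    return abs(sum(gx * math.cos(2 * math.pi * u * x / L) for gx, x in zip(g, xs)) * dx)

def efficiency_and_sidelobe(taper, L=1.0, n=400):
    """Return (aperture efficiency, highest sidelobe of the normalized field pattern)."""
    dx = L / n
    xs = [-L / 2 + (i + 0.5) * dx for i in range(n)]
    g = [taper(x / L) for x in xs]
    eff = (sum(g) * dx) ** 2 / (L * sum(v * v for v in g) * dx)
    us = [0.01 * k for k in range(501)]          # u = l*mu/lambda from 0 to 5
    f = [pattern(g, xs, u, L) for u in us]
    f = [v / f[0] for v in f]
    k = 1
    while f[k] < f[k - 1]:                       # walk down the main lobe to the first null
        k += 1
    return eff, max(f[k:])                       # highest sidelobe beyond the main lobe

uniform = efficiency_and_sidelobe(lambda t: 1.0)
cosine = efficiency_and_sidelobe(lambda t: math.cos(math.pi * t))
print(uniform, cosine)
```

For this particular taper the efficiency drops to $8/\pi^2$ while the first sidelobe of the field pattern falls by roughly a factor of three, which is the trade described in item 1.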

Figure 3.14: Power and field patterns for a 1D uniformly illuminated aperture.

Another important issue in aperture illumination is the amount of aperture blockage.
The feed antenna is usually suspended over the reflecting surface (see Figure 3.3) and
blocks out part of the aperture. If the illumination is tapered, then the central part of the
aperture has the highest illumination and blocking out this region could have a drastic
effect on the power pattern. Consider again a 1D uniformly illuminated aperture of length
$l$ with the central portion of length $d$ blocked out. The far field of this aperture is (from
the linearity of Fourier transforms) just the difference between the far field of an aperture
of length $l$ and an aperture of length $d$, i.e.

$$E(\mu) = \frac{\lambda \sin(\pi l\mu/\lambda)}{\pi\mu} - \frac{\lambda \sin(\pi d\mu/\lambda)}{\pi\mu}$$

or the normalized field pattern is:

$$F(\mu) = \frac{\lambda}{(l-d)} \left[ \frac{\sin(\pi l\mu/\lambda)}{\pi\mu} - \frac{\sin(\pi d\mu/\lambda)}{\pi\mu} \right]$$

The field pattern of the “missing” part of the aperture has a broad main beam (since 
d < l). Subtracting this from the pattern due to the entire aperture will give a resultant 
pattern with very high sidelobes. In Figure 3.15 the solid curve is the pattern due to the 
entire aperture, the dotted line is the pattern of the blocked part and the dark curve is 
the resultant pattern. (This is for a 100 ft blockage of a 700 ft aperture at 2380 MHz.)
Aperture blockage has to be minimized for a ‘clean’ beam; many telescopes have feeds
offset from the reflecting surface altogether to eliminate all blockage.
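To get a feel for the size of the effect, the blocked pattern can be evaluated directly. This is a rough sketch under the same idealization as the text (1D aperture, uniform illumination, central fraction $r = d/l$ removed); the function names and the scan step are illustrative choices, and `u` again denotes $l\mu/\lambda$.

```python
import math

def sinc(u):
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def blocked_pattern(u, r):
    """Normalized field pattern of a uniform aperture of length l whose central
    fraction r = d/l is blocked; u = l*mu/lambda."""
    return (sinc(u) - r * sinc(r * u)) / (1.0 - r)

def first_sidelobe(r, du=1e-3, u_max=3.0):
    """Position and field strength of the first sidelobe, found by a simple scan."""
    us = [k * du for k in range(int(u_max / du) + 1)]
    f = [abs(blocked_pattern(u, r)) for u in us]
    k = 1
    while f[k] < f[k - 1]:        # walk down the main lobe to the first null
        k += 1
    while f[k + 1] > f[k]:        # climb to the next local maximum
        k += 1
    return us[k], f[k]

u0, s0 = first_sidelobe(0.0)              # unblocked aperture
u1, s1 = first_sidelobe(100.0 / 700.0)    # 100 ft blockage of a 700 ft aperture
print(s0, s1, (s1 / s0) ** 2)
```

In this idealized case the first sidelobe of the field pattern roughly doubles, so the power sidelobe grows by a factor of three to four.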

As an example of what we have been discussing, consider the Ooty Radio Telescope
(ORT) shown in Figure 3.16. The reflecting surface is a cylindrical paraboloid (530 m × 30 m)
with its axis parallel to the Earth’s axis. Tracking in RA is accomplished by rotating the
telescope about this axis. Rays falling on the telescope get focused onto a line focus,
where they are absorbed by an array of dipoles. By introducing a linear phase shift
across this dipole array, the antenna beam can be steered in declination (the “phase
shift” property of Fourier transforms). The reflecting surface is only part of a paraboloid
and does not include the axis of symmetry; the feed is hence completely offset and there is no
blockage. The beam however is fan shaped, narrow in the RA direction (i.e. that conjugate
to the 530 m dimension) and broad in the DEC direction (i.e. that conjugate to the 30 m dimension).

Figure 3.15: Effect of aperture blockage on the power pattern.

Aperture blockage is one of the reasons why an antenna’s power pattern would deviate
from what one would ideally expect. Another common problem that affects the power
pattern is the location of the feed antenna. Ideally the feed should be placed at the focus,
but for a variety of reasons, it may actually be displaced from the focus. For example,
as the antenna tracks, the reflecting surface gets distorted and/or the feed legs bend
slightly, and for these reasons, the feed is displaced from the actual focal point of the
reflector. In an antenna like the GMRT, there are several feeds mounted on a cubic
turret at the prime focus, and the desired feed is rotated into position by a servo system
(see Chapter 19). Small errors in the servo system could result in the feed pointing
not exactly at the vertex of the reflector but along some slightly offset direction. This is
illustrated in Figure 3.17. For ease of analysis we have assumed that the feed is held
fixed and the reflector as a whole rotates. The solid line shows the desired location of the
reflector (i.e. with the feed pointing at its vertex) while the dashed line shows the actual
position of the reflector. This displacement between the desired and actual positions
of the reflector results in a phase error (produced by the excess path length between
the desired and actual reflector positions) in the aperture plane. From the geometry of
Figure 3.17 this phase error can be computed, and from it the corresponding distortion
in the field and power patterns can be worked out. Figure 3.18[A] shows the result of
such a calculation. The principal effect is that the beam is offset slightly, but one can
also see that its azimuthal symmetry is lost. Figure 3.18[B] shows the actual measured
power pattern for a GMRT antenna with a turret positioning error. As can be seen, the
calculated error pattern is a fairly good match to the observed one. Note that in plotting
Figure 3.18[B] the offset in the power pattern has been removed (i.e. the power pattern
has been measured with respect to its peak position).

Figure 3.16: The Ooty radio telescope.

Figure 3.17: Turret positioning error. Ideally the feed should point at the vertex of the
reflecting surface, but if the feed turret rotation angle is in error then the feed points
along some offset direction.

Figure 3.18: [A] Calculated beam pattern for a turret positioning error. [B] Measured
beam pattern for a turret positioning error. The offset in the pattern has been removed,
i.e. the power pattern has been measured with respect to its peak position. (Contour
plots; horizontal axis: azimuth in arcmin.)

Further Reading





Chapter 4 

Two Element Interferometers 

Jayaram N. Chengalur 


4.1 Introduction 

From the van Cittert-Zernike theorem (see Chapter 2) it follows that if one knows the mutual
coherence function of the electric field, then the source brightness distribution can
be measured¹. The electric field from the cosmic source is measured using an antenna,
which is basically a device for converting the electric field into a voltage that can then be
further processed electronically (see Chapter 3). In this chapter we will examine exactly
how the mutual coherence function is measured.



Figure 4.1: A basic two element interferometer 


¹ Or in plain English, one can make an image of the source.

We start by looking at the relationship between the output of a two element interferometer
and the wanted mutual coherence function. Large interferometric arrays can be
regarded as collections of two element interferometers, and for this reason it is instructive
to understand in detail the working of a two element interferometer.


4.2 A Two Element Interferometer 

Consider the two element interferometer shown in Figure 4.1. Two antennas 1, 2 whose
(vector) separation is $\mathbf{b}$ are directed towards a point source of flux density $S$. The angle
between the direction to the point source and the normal to the antenna separation vector
is $\theta$. The voltages that are produced at the two antennas due to the electric field from this
point source are $v_1(t)$ and $v_2(t)$ respectively. These two voltages are multiplied together,
and then averaged. Let us start by assuming that the radiation emitted by the source is
monochromatic and has frequency $\nu$. Let the voltage at antenna 1 be $v_1(t) = \cos(2\pi\nu t)$.
Since the radio waves from the source have to travel an extra distance $b\sin\theta$ to reach
antenna 2, the voltage there is delayed by the amount $b\sin\theta/c$. This is called the geometric
delay, $\tau_g$. The voltage at antenna 2 is hence $v_2(t) = \cos(2\pi\nu(t-\tau_g))$, where we have assumed
that the antennas have identical gain. $r(\tau_g)$, the averaged output of the multiplier, is
hence:

$$\begin{aligned}
r(\tau_g) &= \frac{1}{T}\int_{t-T/2}^{t+T/2} \cos(2\pi\nu t')\cos(2\pi\nu(t'-\tau_g))\, dt' \qquad (4.2.1)\\
&= \frac{1}{T}\int_{t-T/2}^{t+T/2} \left(\cos(4\pi\nu t' - 2\pi\nu\tau_g) + \cos(2\pi\nu\tau_g)\right) dt' \\
&= \cos(2\pi\nu\tau_g)
\end{aligned}$$

where we have dropped a constant factor of 1/2 and assumed that the averaging time $T$ is
long compared to $1/\nu$. The $\cos(4\pi\nu t' - 2\pi\nu\tau_g)$ term hence averages out to 0. As the source
rises and sets, the angle $\theta$ changes.
If we assume that the antenna separation vector (usually called the baseline vector or just
the baseline) is exactly east west, and that the source’s declination $\delta_0 = 0$, then $\theta = \Omega_E (t - t_z)$
(where $\Omega_E$ is the angular frequency of the earth’s rotation) and we have:

$$r(\tau_g) = \cos\left(2\pi\nu \times b/c \times \sin(\Omega_E (t - t_z))\right) \qquad (4.2.2)$$

where $t_z$ is the time at which the source is at the zenith. The output $r(\tau_g)$ (also called
the fringe) hence varies in a quasi-sinusoidal form, with its instantaneous frequency
being maximum when the source is at zenith and minimum when the source is either
rising or setting (Figure 4.2).
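The behaviour of the fringe frequency follows from differentiating the phase in eqn (4.2.2). The sketch below evaluates it for a hypothetical 1 km east-west baseline at 327 MHz; both numbers, and the use of the sidereal day for the earth's rotation rate, are assumptions made purely for illustration.

```python
import math

C = 299792458.0                    # speed of light (m/s)
OMEGA_E = 2 * math.pi / 86164.1    # earth rotation rate (rad/s), one sidereal day

def fringe_phase(t, t_z, b, nu):
    """Phase (radians) of the fringe in eqn (4.2.2): east-west baseline, declination 0."""
    return 2 * math.pi * nu * (b / C) * math.sin(OMEGA_E * (t - t_z))

def fringe(t, t_z, b, nu):
    return math.cos(fringe_phase(t, t_z, b, nu))

def fringe_frequency(t, t_z, b, nu):
    """Instantaneous fringe frequency (Hz) = (1 / 2 pi) d(phase)/dt."""
    return nu * (b / C) * OMEGA_E * math.cos(OMEGA_E * (t - t_z))

# Hypothetical numbers: a 1 km baseline observed at 327 MHz, transit at t = 0.
b, nu, t_z = 1000.0, 327e6, 0.0
for hours in (0, 2, 4, 6):
    t = hours * 3600.0
    print(hours, round(fringe_frequency(t, t_z, b, nu), 4))
```

The fringe frequency at transit is $(b/\lambda)\,\Omega_E$, and it falls towards zero as the source approaches the horizon about six hours later.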

Now if the source’s right ascension was known, then one could compute the time at
which the source would be at zenith, and hence the time at which the instantaneous
fringe frequency would be maximum. If the fringe frequency peaks at some slightly different
time, then one knows that the assumed right ascension of the source was slightly in
error. Thus, in principle at least, from the difference between the actual observed peak
time and the expected peak time one could determine the true right ascension of the
source. Similarly, if the source were slightly extended, then when the waves from a given
point on the source arrive in phase at the two ends of the interferometer, waves arising
from adjacent points on the source will arrive slightly out of phase. The observed amplitude
of the fringe will hence be less than what would be obtained for a point source of the
same total flux. The more extended the source, the lower the fringe amplitude². For a
sufficiently large source with smooth brightness distribution, the fringe amplitude will be
essentially zero³. In such circumstances, the interferometer is said to have resolved out
the source.

² Assuming that the source has a uniform brightness distribution.

Figure 4.2: The output of a two element interferometer as a function of time. The solid
line is the observed quasi-sinusoidal output (the fringe), the dotted line is a pure sinusoid
whose frequency is equal to the peak instantaneous frequency of the fringe. The instantaneous
fringe frequency is maximum when the source is at the zenith (the center of the
plot) and is minimum when the source is rising (left extreme) or setting (right extreme).

Further, two element interferometers cannot distinguish between sources whose sizes
are small compared to the fringe spacing; all such sources will appear as point sources.
Equivalently, when the source size is such that waves from different parts of the source
give rise to the same phase lags (within a factor that is small compared to $\pi$), then the
source will appear as a point source. This condition can be translated into a limit on $\Delta\theta$,
the minimum source size that can be resolved by the interferometer, viz.,

$$\pi\nu\,\Delta\theta\, b/c \lesssim \pi \implies \Delta\theta \lesssim \lambda/b$$

i.e., the resolution of a two element interferometer is $\sim \lambda/b$. The longer the baseline,
the higher the resolution.
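The loss of fringe amplitude for an extended source can be illustrated by adding up the fringes from many points across the source. The following sketch assumes a 1D strip of uniform brightness and small angles (both simplifications introduced here for illustration); `b_over_lambda` is the baseline length in wavelengths.

```python
import cmath
import math

def fringe_amplitude(width_rad, b_over_lambda, n=2000):
    """Amplitude of the summed fringe from n equal point sources spread uniformly
    over an angular width width_rad (small-angle approximation), relative to a
    point source of the same total flux."""
    total = 0j
    for k in range(n):
        offset = (k + 0.5) / n * width_rad - width_rad / 2
        total += cmath.exp(-2j * math.pi * b_over_lambda * offset)
    return abs(total) / n

b_over_lambda = 1000.0                                        # hypothetical baseline, in wavelengths
unresolved = fringe_amplitude(0.1 / b_over_lambda, b_over_lambda)   # source size = 0.1 lambda/b
resolved = fringe_amplitude(1.0 / b_over_lambda, b_over_lambda)     # source size = lambda/b
print(unresolved, resolved)
```

A source one tenth of $\lambda/b$ across is nearly indistinguishable from a point source, while a uniform source of width $\lambda/b$ is resolved out completely, consistent with the resolution estimate above.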

Observations with a two element interferometer hence give one information on both 
the source position and the source size. Interferometers with different baseline lengths 
and orientations will place different constraints on the source brightness, and the Fourier 
transform in the van Cittert-Zernike theorem can be viewed as a way to put all this 
information together to obtain the correct source brightness distribution. 


4.3 Response to Quasi-Monochromatic Radiation 

³ This is related to the fact that in the double slit experiment, the interference pattern becomes less distinct
and then eventually disappears as the source size is increased (see e.g. Born & Wolf, ‘Principles of Optics’,
Sixth Edition, Section 7.3.4). In fact the double slit setup is mathematically equivalent to the two element
interferometer, and much of the terminology in radio interferometry is borrowed from earlier optical terminology.

Until now we have assumed that the radiation from the source was monochromatic. Let us
now consider the more realistic case of quasi-monochromatic radiation, i.e. the radiation
spectrum⁴ contains all frequencies in a band $\Delta\nu$ around $\nu$, with $\Delta\nu$ small compared
to $\nu$. If the radiation at some frequency $\nu$ arrives in phase at the two antennas in the
interferometer, the radiation at adjacent frequencies will arrive out of phase, and
if $\Delta\nu$ is large enough, there will be frequencies at which the radiation is actually 180
degrees out of phase. Intuitively, one would hence expect that averaging over all these
frequencies would decrease the amplitude of the fringe. More rigorously, we have


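$$\begin{aligned}
r(\tau_g) &= \frac{1}{\Delta\nu}\int_{\nu-\Delta\nu/2}^{\nu+\Delta\nu/2} \cos(2\pi\nu'\tau_g)\, d\nu' \\
&= \frac{1}{\Delta\nu}\,\mathrm{Re}\left[\int_{\nu-\Delta\nu/2}^{\nu+\Delta\nu/2} e^{i2\pi\nu'\tau_g}\, d\nu'\right] \\
&= \cos(2\pi\nu\tau_g)\left[\frac{\sin(\pi\Delta\nu\tau_g)}{\pi\Delta\nu\tau_g}\right] \qquad (4.3.3)
\end{aligned}$$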


The quantity in square brackets, the sinc function, decreases rapidly with increasing
bandwidth. Hence as one increases the bandwidth that is accepted by the telescope,
the fringe amplitude decreases sharply. This is called fringe washing. However, since in
order to achieve a reasonable signal to noise ratio one needs to accept as wide a
bandwidth as possible⁵, it is necessary to find a way to average over bandwidth without
losing fringe amplitude. To understand how this could be done, it is instructive to first
look at what the fringe would be for a spatially extended source.
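The size of the fringe washing effect is easy to evaluate. The sketch below averages the monochromatic fringe over a flat band numerically and compares it with the closed form of eqn (4.3.3); the band centre, bandwidth and path difference are hypothetical values chosen only to give a visible effect.

```python
import math

def sinc(u):
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def band_averaged_fringe(nu0, dnu, tau_g, n=20000):
    """Midpoint-rule average of cos(2 pi nu tau_g) over [nu0 - dnu/2, nu0 + dnu/2]."""
    total = 0.0
    for k in range(n):
        nu = nu0 - dnu / 2 + (k + 0.5) * dnu / n
        total += math.cos(2 * math.pi * nu * tau_g)
    return total / n

nu0, dnu = 325e6, 16e6            # hypothetical band centre and width (Hz)
tau_g = 10.0 / 299792458.0        # geometric delay for a 10 m path difference (s)

numeric = band_averaged_fringe(nu0, dnu, tau_g)
analytic = math.cos(2 * math.pi * nu0 * tau_g) * sinc(dnu * tau_g)
attenuation = sinc(dnu * tau_g)   # the fringe washing factor in square brackets
print(numeric, analytic, attenuation)
```

With these numbers an uncompensated path difference of only 10 m already costs a large fraction of the fringe amplitude, which is why the delay has to be compensated before averaging over the band.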

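Let the direction vector to some reference point on the source be $\mathbf{s_0}$, and further assume
that the source is small enough that it lies entirely on the tangent plane to the sky at $\mathbf{s_0}$,
i.e. that the direction to any point on the source can be written as $\mathbf{s} = \mathbf{s_0} + \boldsymbol{\sigma}$, $\mathbf{s_0}\cdot\boldsymbol{\sigma} = 0$,
$\tau_g = \mathbf{s_0}\cdot\mathbf{b}/c$. Then, from the van Cittert-Zernike theorem we have⁶:

$$\begin{aligned}
r(\tau_g) &= \mathrm{Re}\left[\int I(\mathbf{s})\, e^{-i2\pi \mathbf{s}\cdot\mathbf{b}/\lambda}\, d\mathbf{s}\right] \\
&= \mathrm{Re}\left[e^{-i2\pi \mathbf{s_0}\cdot\mathbf{b}/\lambda} \int I(\mathbf{s})\, e^{-i2\pi \boldsymbol{\sigma}\cdot\mathbf{b}/\lambda}\, d\mathbf{s}\right] \\
&= |V| \cos(2\pi\nu\tau_g + \Phi_V) \qquad (4.3.4)
\end{aligned}$$

where $V$, the complex visibility, is defined as:

$$V = |V| e^{-i\Phi_V} = \int I(\mathbf{s})\, e^{-i2\pi \boldsymbol{\sigma}\cdot\mathbf{b}/\lambda}\, d\mathbf{s} \qquad (4.3.5)$$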

The information on the source size and structure is contained entirely in $V$; the factor
$\cos(2\pi\nu\tau_g)$ in eqn. (4.3.4) only contains the information that the source rises and sets as
the earth rotates. Since this is trivial and uninteresting, it can safely be suppressed.
Conceptually, the way one could suppress this information is to introduce along the electrical
signal path of antenna 1 an instrumental delay $\tau_i$ which is equal to $\tau_g$. Then we will
have $r(\tau_g) = |V| \cos(\Phi_V)$, i.e. the fast fringe oscillation has been suppressed. One can then
average over frequency and not suffer from fringe washing. Since $\tau_g$ changes with time as
the source rises and sets, $\tau_i$ will also have to be continuously adjusted. This adjustment
of $\tau_i$ is called delay tracking. In most existing interferometers however, the process of
preventing fringe washing is slightly more complicated than the conceptual scheme described
above. The complication arises because delay tracking is usually done digitally in
the baseband, i.e. after the whole chain of frequency translation operations described in
Chapter 3. The geometric delay is however suffered by the incoming radiation, which is
at the RF frequency.

⁴ Radiation from astrophysical sources is inherently broadband. Radio telescopes however have narrow band
filters which accept only a small part of the spectrum of the infalling radiation.

⁵ See Chapter 5.

⁶ Apart from some constant factor related to the gain of the antennas which we have been ignoring throughout.



Figure 4.3: A two element interferometer with fringe stopping and delay tracking (see 
text). 


4.4 Two Element Interferometers in Practice 

To see this more clearly, let us consider the interferometer shown in Figure 4.3. The
signals from antennas 1, 2 are first converted to a frequency $\nu_{BB}$ using a mixer which is
fed using a local oscillator of frequency $\nu_{LO}$⁷, i.e. $\nu_{LO} = \nu_{RF} - \nu_{BB}$. Along the signal path
for antenna 1 an additional instrumental delay $\tau_i = \tau_g + \Delta\tau$ is introduced, as is also a time
varying phase shift $\Phi_f$. The reasons for introducing this phase shift will be clear shortly.
Then (see also equations 4.2.1 and 4.3.4) we have:

$$\begin{aligned}
r(\tau_g) &= |V| \left\langle \cos(2\pi\nu_{BB} t - 2\pi\nu_{RF}\tau_g - \Phi_V)\, \cos(2\pi\nu_{BB}(t - \tau_i) - \Phi_f) \right\rangle \qquad (4.4.6)\\
&= |V| \cos(\Phi_V + 2\pi(\nu_{RF} - \nu_{BB})\tau_g - 2\pi\nu_{BB}\Delta\tau - \Phi_f) \\
&= |V| \cos(\Phi_V + 2\pi\nu_{LO}\tau_g - 2\pi\nu_{BB}\Delta\tau - \Phi_f) \qquad (4.4.7)
\end{aligned}$$


⁷ Note that it is important that the phase of the local oscillator signal be identical at the two antennas, i.e.
the local oscillator signal has to be distributed in a phase coherent way to both antennas in the interferometer.
Chapter 23 explains how this is achieved at the GMRT.

So, in order to compensate for all time varying phase factors, it is not sufficient to have
$\tau_i = \tau_g$; one also needs to introduce a time varying phase $\Phi_f = 2\pi\nu_{LO}\tau_g$. This additional
correction arises because the delay tracking is done at a frequency different from $\nu_{RF}$.
The introduction of the time varying phase is called fringe stopping. Fringe stopping can
be achieved in a variety of ways. One common practice is to vary the instantaneous phase
of the local oscillator signal in arm 1 of the interferometer by the amount $\Phi_f$. Another
possibility (which is the approach taken at the GMRT) is to digitally multiply the signal
from antenna 1 by a sinusoid with the appropriate instantaneous frequency.
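The need for both corrections can be checked with a toy simulation. The sketch below is not the GMRT implementation; it assumes one self-consistent sign convention (antenna 2 at baseband is $\cos(2\pi\nu_{BB}t - 2\pi\nu_{RF}\tau_g - \Phi_V)$, and the delay and phase are applied to antenna 1 as $\cos(2\pi\nu_{BB}(t-\tau_i) - \Phi_f)$), and all frequencies and visibility values are hypothetical.

```python
import math

def correlator_output(v_amp, phi_v, nu_rf, nu_bb, tau_g, tau_i, phi_f, n=20000):
    """Average product of the two baseband voltages over 100 baseband periods.
    The factor 2 undoes the 1/2 that comes from averaging cos * cos."""
    duration = 100.0 / nu_bb
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * duration / n
        v2 = math.cos(2 * math.pi * nu_bb * t - 2 * math.pi * nu_rf * tau_g - phi_v)
        v1 = math.cos(2 * math.pi * nu_bb * (t - tau_i) - phi_f)
        total += v1 * v2
    return 2.0 * v_amp * total / n

nu_rf, nu_bb = 327e6, 8e6          # hypothetical RF and baseband frequencies (Hz)
nu_lo = nu_rf - nu_bb
v_amp, phi_v = 0.7, 0.4            # hypothetical visibility amplitude and phase

def tracked(tau_g):
    """Delay tracking plus fringe stopping: tau_i = tau_g, phi_f = 2 pi nu_LO tau_g."""
    return correlator_output(v_amp, phi_v, nu_rf, nu_bb, tau_g,
                             tau_i=tau_g, phi_f=2 * math.pi * nu_lo * tau_g)

def delay_only(tau_g):
    """Delay tracking at baseband without the fringe stopping phase."""
    return correlator_output(v_amp, phi_v, nu_rf, nu_bb, tau_g, tau_i=tau_g, phi_f=0.0)

for tau_g in (0.0, 1.0e-7, 1.0e-6):
    print(tau_g, round(tracked(tau_g), 6), round(delay_only(tau_g), 6))
```

With both corrections the output stays at $|V|\cos\Phi_V$ for every delay, whereas with baseband delay tracking alone it still oscillates as $\cos(\Phi_V + 2\pi\nu_{LO}\tau_g)$.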

Another consequence of doing delay tracking digitally is that the geometric delay can
be quantized only up to a step size which is related to the sampling interval with which
the signal was digitized. In general therefore $\Delta\tau$ is not zero, and is called the fractional
sampling time error. Correction for this error will be discussed in Chapter 9.

The delay tracking and fringe stopping corrections apply for a specific point in the
sky, viz. the position $\mathbf{s_0}$. This point is called the phase tracking center⁸. Signals, such
as terrestrial interference, which enter from the far sidelobes of the antennas do not
suffer the same geometric delay $\tau_g$ as that suffered by the source. Consequently, delay
tracking and fringe stopping introduce a rapidly varying change in the phase of these
signals. On long baselines, where the fringe rate is rapid, the terrestrial interference could
hence get completely decorrelated. While this may appear to be a terrific added bonus
in principle, terrestrial interference is usually so much stronger than the emission from
cosmic sources that even the residual correlation is sufficient to completely swamp
the desired signal.

We end this chapter by re-establishing the connection between what we have just done
and the van Cittert-Zernike theorem. The first issue that we have to appreciate is that
the van Cittert-Zernike theorem deals with the complex visibility, $V = |V| e^{-i\Phi_V}$. However,
the quantity that has been measured is $r(\tau_g) = |V| \cos(-\Phi_V)$. If one could also measure
$|V| \sin(-\Phi_V)$, then of course one could reconstruct the full complex visibility. This is indeed
what is done at interferometers. Conceptually, one has two multipliers instead of
the one in Figure 4.3. The second multiplier is fed the same input as that in Figure 4.3,
except that an additional phase difference of $\pi/2$ is introduced in each signal path. As
can be easily verified, the output of this multiplier is $|V| \sin(-\Phi_V)$. Such an arrangement
of two multipliers is called a complex correlator. The two outputs are called the cosine and
sine outputs respectively. For quasi-sinusoidal processes, one has to introduce a $\pi/2$
phase difference at each frequency present in the signal. The corresponding transformation
is called a Hilbert transform⁹. How the complex correlator is achieved at the GMRT
is described in Chapter 9. The output of the complex correlator is hence a single component
of the Fourier transform of the source brightness distribution¹⁰. The component
measured depends on the antenna separation as viewed from the source, i.e. $(\mathbf{b}\cdot\mathbf{s_0})/\lambda$,
which is also called the projected baseline length. For a large smooth source, the Fourier
transform will be sharply peaked about the origin, and hence the visibility measured on
long baselines will be small.
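The way the two multiplier outputs combine into a complex visibility can be seen in a monochromatic toy model. This sketch is an illustration only: it assumes that, after delay tracking and fringe stopping, antenna 1 delivers $\cos(\omega t)$ and antenna 2 delivers $|V|\cos(\omega t + \Phi_V)$, and it applies the $\pi/2$ shift to the antenna 1 signal for the second multiplier.

```python
import cmath
import math

def complex_correlator(v_amp, phi_v, n=1000):
    """Cosine and sine multiplier outputs, averaged over one full signal cycle.
    The factor 2 undoes the 1/2 from averaging a product of sinusoids."""
    rc = rs = 0.0
    for k in range(n):
        wt = 2 * math.pi * (k + 0.5) / n
        v2 = v_amp * math.cos(wt + phi_v)
        rc += math.cos(wt) * v2                  # first multiplier (cosine output)
        rs += math.cos(wt - math.pi / 2) * v2    # same input with a pi/2 phase shift (sine output)
    return complex(2 * rc / n, 2 * rs / n)

V = complex_correlator(0.7, 1.1)     # hypothetical amplitude 0.7 and phase 1.1 rad
amplitude = abs(V)
phase = -cmath.phase(V)              # V = |V| exp(-i Phi_V)
print(amplitude, phase)
```

The cosine output is $|V|\cos(-\Phi_V)$ and the sine output is $|V|\sin(-\Phi_V)$, so together they give back both the amplitude and the phase of $V$.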

⁸ For maximum sensitivity, one would also point the antennas such that their primary beam maxima are also
at $\mathbf{s_0}$.

⁹ See Chapter 1.

¹⁰ This is true only if the antenna dimensions are neglected. Strictly speaking, the measured visibility is
an average over the visibilities in the range $b + a$ to $b - a$ where $a$ is the diameter of the antennas and $b$ is
the separation between their midpoints. As will be seen in Chapter 14 the fact that one has information on
visibilities on scales smaller than $b$ is useful when attempting to image large regions of the sky.

Further Reading

1. Thompson, A. R., Moran, J. M. & Swenson, G. W. Jr., ‘Interferometry & Synthesis in
Radio Astronomy’, Wiley Interscience.

2. Perley, R. A., Schwab, F. R. & Bridle, A. H., eds., ‘Synthesis Imaging in Radio Astronomy’,
ASP Conf. Series, vol. 6.

Chapter 5

Sensitivity and Calibration for Interferometers


Jayaram N. Chengalur 


5.1 Sensitivity 

As we discussed earlier, an aperture synthesis telescope can be regarded as a collection of
two element interferometers. Hence, for understanding the sensitivity of such a telescope,
it is easier to first start with the case of a two element interferometer. Consider such an
interferometer composed of two antennas $i, j$ (of identical gains, but possibly different
system temperatures), looking at a point source of flux density $S$. We assume that the
point source is at the phase center¹ and hence that in the absence of noise the visibility
phase is zero. Let the individual antenna gains² be $G$ and system temperatures be $T_{s_i}$
and $T_{s_j}$. If $n_i(t)$ and $n_j(t)$ are the noise voltages of antennas $i$ and $j$ respectively, then
$\sigma_i^2 = \langle n_i^2(t)\rangle = T_{s_i}$, and $\sigma_j^2 = \langle n_j^2(t)\rangle = T_{s_j}$. Similarly if $v_i(t)$ and $v_j(t)$ are the voltages
induced by the incoming radiation from the point source, $\langle v_i^2(t)\rangle = \langle v_j^2(t)\rangle = GS$. The
instantaneous correlator³ output is given by:

$$r_{ij}(t) = (v_i(t) + n_i(t))(v_j(t) + n_j(t))$$

The mean⁴ of the correlator output is hence:

$$\begin{aligned}
\langle r_{ij}(t)\rangle &= \langle (v_i(t) + n_i(t))(v_j(t) + n_j(t)) \rangle \\
&= \langle v_i(t) v_j(t) \rangle \\
&= GS \qquad (5.1.1)
\end{aligned}$$

where we have assumed that the noise voltages of the two antennas are not correlated,
and also of course that the signal voltages are not correlated with the noise voltages. $r_{ij}(t)$
is hence an unbiased estimator of the true visibility.

¹ See Chapter 4.

² Here the gain is taken to be in units of Kelvin per Jansky of flux in the matched polarization.

³ Here we are dealing with an ordinary correlator, not the complex correlator introduced in the chapter on two
element interferometers.

⁴ Note that the average being taken here is an ensemble average, and not an average over time.

To determine the noise in the correlator output, we would need to compute the rms of
$r_{ij}(t)$ for which we need to be able to work out:

$$\langle r_{ij}(t) r_{ij}(t)\rangle = \langle (v_i + n_i)(v_j + n_j)(v_i + n_i)(v_j + n_j) \rangle$$


where for ease of notation we have stopped explicitly specifying that all voltages are
functions of time. This quantity is not trivial to work out in general. However, if we
assume that all the random processes involved are Gaussian processes⁵ the complexity
is considerably reduced because for Gaussian random variables the fourth moment
can then be expressed in terms of products of the second moments. In particular⁶, if
$x_1$, $x_2$, $x_3$ and $x_4$ have a zero-mean joint Gaussian distribution then:

$$\langle x_1 x_2 x_3 x_4 \rangle = \langle x_1 x_2 \rangle \langle x_3 x_4 \rangle + \langle x_1 x_3 \rangle \langle x_2 x_4 \rangle + \langle x_1 x_4 \rangle \langle x_2 x_3 \rangle \qquad (5.1.2)$$
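The fourth moment identity in eqn (5.1.2) can be spot-checked by simulation. The sketch below draws four zero-mean correlated Gaussian variables as linear combinations of independent unit Gaussians; the mixing coefficients, the sample size and the seed are arbitrary choices made for this illustration.

```python
import random

random.seed(1)                      # fixed seed so that the run is repeatable
n = 200_000
m4 = 0.0                            # running sum for <x1 x2 x3 x4>
m2 = {"12": 0.0, "34": 0.0, "13": 0.0, "24": 0.0, "14": 0.0, "23": 0.0}
for _ in range(n):
    z1, z2, z3, z4 = (random.gauss(0.0, 1.0) for _ in range(4))
    # Four zero-mean, mutually correlated Gaussians (arbitrary mixing coefficients).
    x1 = z1
    x2 = 0.5 * z1 + z2
    x3 = 0.3 * z1 + 0.4 * z2 + z3
    x4 = 0.2 * z1 + 0.1 * z3 + z4
    m4 += x1 * x2 * x3 * x4
    m2["12"] += x1 * x2
    m2["34"] += x3 * x4
    m2["13"] += x1 * x3
    m2["24"] += x2 * x4
    m2["14"] += x1 * x4
    m2["23"] += x2 * x3

lhs = m4 / n
rhs = (m2["12"] * m2["34"] + m2["13"] * m2["24"] + m2["14"] * m2["23"]) / n ** 2
exact = 0.5 * 0.16 + 0.3 * 0.1 + 0.2 * 0.55    # from the true covariances of x1..x4
print(lhs, rhs, exact)
```

The sample fourth moment and the sum of products of sample second moments agree to within the Monte Carlo noise, and both are close to the value computed from the true covariances.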

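Rather than directly computing $\langle r_{ij}(t) r_{ij}(t)\rangle$, it is instructive first to consider the more
general quantity

$$\langle r_{ij}(t) r_{kl}(t)\rangle = \langle (v_i + n_i)(v_j + n_j)(v_k + n_k)(v_l + n_l) \rangle$$

viz. the cross-correlation between the outputs of interferometers $(ij)$ and $(kl)$. We
have:

$$\begin{aligned}
\langle r_{ij}(t) r_{kl}(t)\rangle ={}& \langle (v_i + n_i)(v_j + n_j)\rangle \langle (v_k + n_k)(v_l + n_l)\rangle + \\
& \langle (v_i + n_i)(v_k + n_k)\rangle \langle (v_j + n_j)(v_l + n_l)\rangle + \\
& \langle (v_i + n_i)(v_l + n_l)\rangle \langle (v_k + n_k)(v_j + n_j)\rangle \\
={}& (\langle v_i v_j\rangle + \langle n_i^2\rangle \delta_{ij})(\langle v_k v_l\rangle + \langle n_k^2\rangle \delta_{kl}) + \\
& (\langle v_i v_k\rangle + \langle n_i^2\rangle \delta_{ik})(\langle v_j v_l\rangle + \langle n_j^2\rangle \delta_{jl}) + \\
& (\langle v_i v_l\rangle + \langle n_i^2\rangle \delta_{il})(\langle v_k v_j\rangle + \langle n_k^2\rangle \delta_{kj}) \\
={}& (GS)^2 + GS(\sigma_i^2 \delta_{ij} + \sigma_k^2 \delta_{kl}) + \sigma_i^2 \delta_{ij} \sigma_k^2 \delta_{kl} + \\
& (GS)^2 + GS(\sigma_i^2 \delta_{ik} + \sigma_j^2 \delta_{jl}) + \sigma_i^2 \delta_{ik} \sigma_j^2 \delta_{jl} + \\
& (GS)^2 + GS(\sigma_i^2 \delta_{il} + \sigma_k^2 \delta_{kj}) + \sigma_i^2 \delta_{il} \sigma_k^2 \delta_{kj} \qquad (5.1.3)
\end{aligned}$$

The case we are currently interested in is $\langle r_{ij}(t) r_{ij}(t)\rangle$, which from eqn (5.1.3) is:

$$\begin{aligned}
\langle r_{ij}(t) r_{ij}(t)\rangle &= 3(GS)^2 + (\sigma_i^2 + \sigma_j^2) GS + \sigma_i^2 \sigma_j^2 \\
&= 2(GS)^2 + (GS + T_{s_i})(GS + T_{s_j}) \qquad (5.1.4)
\end{aligned}$$

To get the variance of $r_{ij}(t)$ we need to subtract the square of the mean of $r_{ij}(t)$ from the
expression in eqn (5.1.4). Substituting for $\langle r_{ij}(t)\rangle^2$ from eqn (5.1.1) we have:

$$\sigma_{ij}^2 = (GS)^2 + (GS + T_{s_i})(GS + T_{s_j}) \qquad (5.1.5)$$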
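The mean and variance of the instantaneous correlator output can be checked with a small Monte Carlo experiment. The sketch below models the point source at the phase center as a single Gaussian voltage common to both antennas, plus independent Gaussian receiver noise; the values of `GS`, `T_i` and `T_j` are hypothetical and the sample size is an arbitrary choice.

```python
import math
import random

random.seed(2)                      # fixed seed so that the run is repeatable
GS, T_i, T_j = 1.0, 2.0, 3.0        # hypothetical values, all in the same (temperature) units
n = 200_000
s1 = s2 = 0.0
for _ in range(n):
    v = random.gauss(0.0, math.sqrt(GS))       # source voltage, common to both antennas
    n_i = random.gauss(0.0, math.sqrt(T_i))    # independent receiver noise, antenna i
    n_j = random.gauss(0.0, math.sqrt(T_j))    # independent receiver noise, antenna j
    r = (v + n_i) * (v + n_j)                  # instantaneous correlator output
    s1 += r
    s2 += r * r

mean = s1 / n
var = s2 / n - mean ** 2
predicted_mean = GS                                        # eqn (5.1.1)
predicted_var = GS ** 2 + (GS + T_i) * (GS + T_j)          # eqn (5.1.5)
print(mean, predicted_mean, var, predicted_var)
```

The simulated mean and variance agree with eqns (5.1.1) and (5.1.5) to within the sampling noise of the experiment.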

Note that the angular brackets denote ensemble averaging. In real life of course one
cannot do an ensemble average. Instead one does an average over time, i.e. we work in
terms of a time averaged correlator output $\bar{r}_{ij}(t)$, defined as

$$\bar{r}_{ij}(t) = \frac{1}{T}\int_{t-T/2}^{t+T/2} r_{ij}(t')\, dt'$$

⁵ Recall from the discussion of sensitivity of a single dish telescope that the central limit theorem ensures
that the signal and noise statistics will be well approximated by a Gaussian. This of course does not include
‘systematics’, like e.g. interference, or correlator offsets because of bits getting stuck in the on or off mode etc.

⁶ The derivation of this expression is particularly straightforward if one works with the moment generating
function; see also the derivation sketched in Chapter 1.


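As can easily be verified, $\langle \bar{r}_{ij} \rangle = \langle r_{ij} \rangle$. However, computing the second moment, viz.,
$\sigma_{\bar{r}_{ij}}^2 = \langle \bar{r}_{ij} \bar{r}_{ij} \rangle - \langle \bar{r}_{ij} \rangle^2$, is slightly more tricky. It can be shown⁷ that if $x(t)$ is a zero mean
stationary process and $\bar{x}(t)$ is the time average of $x(t)$ over the interval $(t - T/2, t + T/2)$,
then

$$\sigma_{\bar{x}}^2 = \frac{1}{T}\int_{-T}^{T} \left(1 - \frac{|\tau|}{T}\right) R_{xx}(\tau)\, d\tau \qquad (5.1.6)$$

where $R_{xx}(\tau)$ is the auto-correlation function of $x(t)$, and $\sigma_{\bar{x}}^2$ is the variance of $\bar{x}(t)$. Now,
if $x(t)$ is a quasi-sinusoidal process with bandwidth $\Delta\nu$, then the integral of $R_{xx}(\tau)$ will
be negligible outside the coherence time $1/\Delta\nu$. Further, if $T \gg 1/\Delta\nu$, then the factor in
parentheses in eqn (5.1.6) can be taken to be $\sim 1$ for $\tau < 1/\Delta\nu$. Hence we have:

$$\begin{aligned}
\sigma_{\bar{x}}^2 &\simeq \frac{1}{T}\int_{-T}^{T} R_{xx}(\tau)\, d\tau \simeq \frac{1}{T}\int_{-\infty}^{\infty} R_{xx}(\tau)\, d\tau \\
&= \frac{1}{T} S_{xx}(0) \\
&= \frac{1}{T}\frac{\sigma_x^2}{2\Delta\nu} \qquad (5.1.7)
\end{aligned}$$

where $S_{xx}(\nu) = \sigma_x^2/2\Delta\nu$ is the power spectrum⁸ of $x(t)$ (with $\sigma_x^2$ the variance of $x(t)$). From
eqn (5.1.7) and eqn (5.1.5) we hence have

$$\sigma_{\bar{r}_{ij}}^2 = \frac{1}{2T\Delta\nu}\left((GS)^2 + (GS + T_{s_i})(GS + T_{s_j})\right) \qquad (5.1.8)$$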


Putting all this together we get that the signal to noise ratio of a two element interferometer
is given by:

$$\mathrm{snr} = \frac{\sqrt{2T\Delta\nu}\; GS}{\sqrt{(GS)^2 + (GS + T_{s_i})(GS + T_{s_j})}} \qquad (5.1.9)$$

There are two special cases which often arise in practice. The first is when the source is
weak, i.e. $GS \ll T_s$. In this case the snr becomes

$$\mathrm{snr} = \frac{\sqrt{2T\Delta\nu}\; GS}{\sqrt{T_{s_i} T_{s_j}}} \qquad (5.1.10)$$

For a single dish with the collecting area equal to the sum of the collecting areas of
antennas $i$ and $j$ (i.e. with gain $2G$), and with system temperature $T_s = \sqrt{T_{s_i} T_{s_j}}$, the
signal to noise would have been a factor of $\sqrt{2}$ better⁹. The loss of signal to noise in
the two element interferometer is because one does not measure the auto-correlations of
antennas $i$ and $j$. Only their cross-correlation has been measured. In a single dish one
would have effectively measured the cross-correlation as well as the auto-correlations.

⁷ Papoulis, ‘Probability, Random Variables & Stochastic Processes’, Third Edition, Chapter 10.

⁸ Where we have made the additional assumption that $x(t)$ is a white noise process, i.e. that its spectrum is
flat. The power spectrum for such processes is easily derived from noting that $\int_{-\infty}^{\infty} S_{xx}(\nu)\, d\nu = \sigma_x^2$, and that for
a quasi-sinusoidal process of bandwidth $\Delta\nu$, the integrand is non zero only over an interval $2\Delta\nu$ (including the
negative frequencies).

⁹ As you can easily derive from eqns 5.1.1 and 5.1.3 by putting $i = j = k = l$. Note that in this case eqn 5.1.1
becomes $\langle r_{ii}(t)\rangle = \langle (v_i(t) + n_i(t))(v_i(t) + n_i(t)) \rangle = 2GS + T_s$.

The other special case of interest is when the source is extremely bright, i.e. $GS \gg T_s$.
In this case, the signal to noise ratio is:

$$\mathrm{snr} = \frac{\sqrt{2T\Delta\nu}\; GS}{\sqrt{2(GS)^2}} = \sqrt{T\Delta\nu} \qquad (5.1.11)$$

This is as expected, because for very bright sources, one is limited by the Poisson fluctuations
of the source brightness, and hence one would expect the signal to noise ratio
to go as the square root of the number of independent measurements. Since one gets
an independent measurement every $1/\Delta\nu$ seconds, the total number of independent measurements
in a time $T$ is just $T\Delta\nu$.
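The two limiting cases can be checked against the general expression numerically. The sketch below evaluates eqn (5.1.9) for a very weak and a very bright source; the integration time, bandwidth and system temperatures are hypothetical values, and `GS` is expressed in the same units as the system temperatures.

```python
import math

def snr_two_element(GS, T_i, T_j, T, dnu):
    """Eqn (5.1.9): GS and the system temperatures share units; T in s, dnu in Hz."""
    noise = math.sqrt(GS ** 2 + (GS + T_i) * (GS + T_j))
    return math.sqrt(2 * T * dnu) * GS / noise

T, dnu = 10.0, 16e6                 # hypothetical integration time (s) and bandwidth (Hz)
T_i, T_j = 100.0, 150.0             # hypothetical system temperatures (K)

weak = snr_two_element(1e-3, T_i, T_j, T, dnu)
weak_limit = math.sqrt(2 * T * dnu) * 1e-3 / math.sqrt(T_i * T_j)   # weak source limit

bright = snr_two_element(1e7, T_i, T_j, T, dnu)
bright_limit = math.sqrt(T * dnu)                                   # bright source limit, eqn (5.1.11)

print(weak, weak_limit, bright, bright_limit)
```

In the bright source limit the result no longer depends on the source strength or on the system temperatures, only on the number of independent samples $T\Delta\nu$.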

Having derived the signal to noise ratio for a two element interferometer, let us now 
consider the case of an N element interferometer. This can be considered as ;V C 2 two 
element interferometers. Let us take the case where the source is weak. Then from 
eqn(5.1.3) the correlation between ri-iff) and n 3 (t) is given by 

(ri 2 (t)ri 3 (f)) = <j\ 5 i 2 cr\ 5 i 3 -\- <^i 3 C r i<^21 + of £1102£23 

= 0 (5.1.12) 


The outputs are uncorrelated, even though these two interferometers have one antenna 
in common 10. Similarly, one can show that (as expected) the outputs of two two-element 
interferometers with no antenna in common are uncorrelated. Since the $r_{ij}$'s are all 
uncorrelated with one another, the rms noise can simply be added in quadrature. In 
particular, for an N element array, where all the antennas are identical and have the 
same system temperature, the signal to noise ratio while looking at a weak source is: 

$$\mathrm{snr} = \frac{\sqrt{N(N-1)\,T\Delta\nu}\;GS}{T_s} \qquad (5.1.13)$$

This is the fundamental equation 11 that is used to estimate the integration time required 
for a given observation. The signal to noise ratio for an N element interferometer is less 
than what would have been expected for a single dish telescope with area N times that 
of a single element of the interferometer, but only by the factor $N/\sqrt{N(N-1)}$. The lower 
sensitivity is again because the N auto-correlations have not been measured. For large N 
however, this loss of information is negligible. For the GMRT, $N = 30$ and $N/\sqrt{N(N-1)} = 1.02$, 
hence the snr is essentially the same as that of a single dish with 30 times the 
collecting area of a single GMRT dish. 
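
As a quick numerical check of eqn (5.1.13), the following Python sketch evaluates the weak-source signal to noise ratio and the factor $N/\sqrt{N(N-1)}$. The function names and the example values of GS, the system temperature, the integration time and the bandwidth are illustrative assumptions, not values taken from these notes.

```python
import math

def array_snr(n_ant, gs, t_sys, t_int, bandwidth):
    """Weak-source snr of an N element array, eqn (5.1.13).

    gs and t_sys must be in the same units (e.g. kelvin);
    t_int is in seconds and bandwidth in Hz.
    """
    return math.sqrt(n_ant * (n_ant - 1) * t_int * bandwidth) * gs / t_sys

def single_dish_advantage(n_ant):
    """Factor N / sqrt(N(N-1)) by which an equivalent single dish does better."""
    return n_ant / math.sqrt(n_ant * (n_ant - 1))

if __name__ == "__main__":
    # Hypothetical numbers: GS = 1 mK per antenna, 100 K system temperature,
    # 1 hour integration, 16 MHz bandwidth.
    print(array_snr(30, 1e-3, 100.0, 3600.0, 16e6))
    print(single_dish_advantage(30))  # close to 1.02 for N = 30
```

Inverting `array_snr` for `t_int` gives the integration time needed to reach a desired snr, which is how eqn (5.1.13) is normally used in planning an observation.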

For a complex correlator 12, the analysis that we have just done holds separately for 
the cosine and sine channels of the correlator. If we call the outputs of such a correlator 
$r^c_{ij}$ and $r^s_{ij}$ then it can be shown that the noise in $r^c_{ij}$ and $r^s_{ij}$ is uncorrelated. Further, 
since the time averaging can be regarded as the adding together of a large number of 
independent samples ($\sim T\Delta\nu$), from the central limit theorem, the statistics of the noise 
in $r^c_{ij}$ and $r^s_{ij}$ are well approximated as Gaussian. It is then possible to derive the statistics 
of functions of $r^c_{ij}$ and $r^s_{ij}$, such as the visibility amplitude ($\sqrt{(r^c_{ij})^2 + (r^s_{ij})^2}$) and the visibility 
phase ($\tan^{-1}(r^s_{ij}/r^c_{ij})$). For example, it can be shown that the visibility amplitude has a Rice 
distribution 13. 

10 This may seem counter-intuitive, but note that the outputs are only uncorrelated, they are not independent. 

11 In some references, an efficiency factor $\eta$ is introduced to account for degradation of signal to noise ratio 
because of the noise introduced by finite precision digital correlation etc. This factor has been ignored here, or 
equivalently one can assume that it has been absorbed into the system temperature. 

12 See the chapter on two element interferometers 

13 Papoulis, 'Probability, Random Variables & Stochastic Processes', Third Edition. Chapter 6. 
For an extended source, the entire analysis that we have done continues to hold, with 
the exception that S should be treated as the correlated part of the source flux density. 
For example, at low frequencies, the Galactic background is often much larger than the 
receiver noise and one would imagine that the limiting case of large source flux density 
(i.e. eqn(5.1.11)) is applicable. However, since this background is largely resolved out at 
even modest spacings, its only effect is an increase in the system temperature. 

Finally we look at the noise in the image plane, i.e. after Fourier transformation of the 
visibilities. Since most of the astronomical analysis and interpretation will be based on 
the image, it is the statistics in the image plane that are usually of interest. The intensity 
at some point $(l, m)$ in the image plane is given by: 

$$I(l, m) = \frac{1}{M} \sum_p w_p V_p\, e^{-i2\pi(l u_p + m v_p)}$$

where $w_p$ is the weight 14 given to the $p$th visibility measurement $V_p$, and there are a 
total of $M$ independent measurements. The cross-correlation function in the image plane, 
$\langle I(l, m) I(l', m') \rangle$, is hence (using the fact that the image is real): 

$$\langle I(l, m) I(l', m') \rangle = \frac{1}{M^2} \sum_p \sum_q w_p w_q \langle V_p V_q^* \rangle\, e^{-i2\pi(l u_p + m v_p)}\, e^{i2\pi(l' u_q + m' v_q)}$$

In the absence of any sources, the visibilities are uncorrelated with one another, and 
hence, we have 

$$\langle I(l, m) I(l', m') \rangle = \frac{1}{M^2} \sum_p w_p^2 \sigma_p^2\, e^{-i2\pi((l - l')u_p + (m - m')v_p)}$$

where $\sigma_p^2$ is the noise variance of the $p$th visibility. Hence in the case that the noise on 
each measurement is the same, and that the weights given to each visibility point are also 
the same (i.e. uniform tapering), the correlation in the map plane has exactly the same 
shape as the dirty beam. Further, the variance in the image plane would then be $\sigma_V^2/M$, 
where $\sigma_V^2$ is the noise variance on a single visibility measurement. This is equivalent 
to eqn(5.1.13), as indeed it should be. 

Because the noise in the image plane has a correlation function shaped like the dirty 
beam, one can roughly take that the noise in each resolution element is uncorrelated. 
The expected statistics after simple image plane operations (like smoothing) can hence 
be worked out. However, after more complicated operations, like the various possible 
deconvolution operations, the statistics in the image plane are not easy to derive. 


5.2 Calibration 

We have assumed till now that we have been working with calibrated visibilities, i.e. free 
from all instrumental effects (apart from some additive noise component). In reality, 
the correlator output is different from the true astronomical visibility for a variety of 
reasons, to do with both instrumental effects as well as propagation effects in the earth’s 
atmosphere and ionosphere. 

At low frequencies, it is the effect of the ionosphere that is dominant. As is 
discussed in more detail in Chapter 16, density irregularities cause phase irregularities in 
the wavefront of the incoming radio waves. One would expect therefore that the image 
of the source would be distorted in the same way that atmospheric turbulence (‘seeing’) 
distorts stellar images at optical wavelengths. To first order this is true, but for the 
ionosphere the ‘seeing disk’ is generally smaller than the diffraction limit of typical 
interferometers. There are two other effects however which are more troublesome. The first is 
‘scintillation’, where because of diffractive effects the flux density of the source changes 
rapidly - the flux density modulation could approach 100%. The other is that slowly 
varying, large scale refractive index gradients cause the apparent source position to wander. 
At low frequencies, the source position could often wander by several arc minutes, i.e. 
considerably more than the synthesized beam. As we shall see below, provided the time 
scale of this wander is slow enough, it can be corrected for. 

14 As discussed in Chapter 11, this weight is in general a combination of weights chosen from signal to noise 
ratio considerations and from synthesized beam shaping considerations. 

Let us take the case where the effect of the ionosphere is simply to produce an excess 
path length, i.e. for an antenna $i$ let the excess phase 15 for a point source at sky position 
$(l, m)$ be $\phi_i(l, m, t)$, where we have explicitly put in a time dependence. Then the observed 
visibility $\tilde V_{ij}(t)$ on a baseline $(i, j)$ would be 

$$\tilde V_{ij}(t) = G_{ij}(t) \int I(l, m)\, e^{-i(\phi_i(l, m, t) - \phi_j(l, m, t))}\, e^{-i2\pi(l u_{ij} + m v_{ij})}\, dl\, dm \qquad (5.2.14)$$

where $I(l, m)$ is the sky brightness distribution and we have ignored the primary beam 16. 
$G_{ij}(t)$ is the ‘instrumental phase’, i.e. the phase produced by the amplifiers, transmission 
lines, or other instrumentation along the signal path. If $\phi_i(l, m, t)$ were some general, 
unknown function of $(l, m, t)$ it would not be possible to reconstruct the true visibility 
from the measured one. However, since the size scale of ionospheric disturbances is ~ 
a few hundred kilometers, it is often the case that $\phi_i(l, m, t)$ is constant over the entire 
primary beam, i.e. there is no $(l, m)$ dependence. The source is then said to lie within a 
single iso-planatic patch. In such situations, the ionospheric phase can be taken out of 
the integral, and eqn(5.2.14) reduces to: 

$$\tilde V_{ij}(t) = G_{ij}(t)\, e^{-i(\phi_i(t) - \phi_j(t))} \int I(l, m)\, e^{-i2\pi(l u_{ij} + m v_{ij})}\, dl\, dm \qquad (5.2.15)$$

If it is also the case that the ionospheric and instrumental gains are changing slowly, then 
they can be calibrated in the following manner. Suppose that close to the source of 
interest, there is a calibration source whose true visibility $V_{ij}$ is known. Then one could 
intersperse observations of the target source with observations of the calibrator. For the 
calibrator, dividing the observed visibility $\tilde V_{ij}(t)$ by the (known) true visibility $V_{ij}(t)$, one 
can measure the factor $G_{ij}(t)\, e^{-i(\phi_i(t) - \phi_j(t))}$. This can then be applied as a correction to 
the visibilities of the target source. For slightly better corrections, one could interpolate 
in time between calibrator observations. This is the basis of what is sometimes called 
‘ordinary’ calibration. The calibrator source is usually an isolated point source, although 
this is not, strictly speaking, necessary. It is sufficient to know the true visibilities $V_{ij}(t)$. 
Note that if the calibrator's absolute flux is also known, then this calibration procedure 
will also calibrate the amplitude scale of the target source 17. 
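
The ‘ordinary’ calibration procedure described above can be sketched in a few lines of Python. This is only an illustration for a single baseline: the function names, the linear interpolation of the complex gain between two calibrator scans, and all the numbers are assumptions made for the example.

```python
import cmath

def baseline_gain(observed_cal, true_cal):
    """Complex gain factor on one baseline: observed / true calibrator visibility."""
    return observed_cal / true_cal

def interpolate_gain(t, t0, g0, t1, g1):
    """Linear interpolation in time between two calibrator scans.

    Interpolating the real and imaginary parts is adequate only when the
    gain changes little between the scans.
    """
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * g0 + w * g1

def calibrate(observed_target, gain):
    """Divide the estimated gain factor out of a target visibility."""
    return observed_target / gain

if __name__ == "__main__":
    # Hypothetical numbers: a 2 Jy point-source calibrator at the phase centre
    true_cal = 2.0 + 0.0j
    g_a = baseline_gain(1.8 * cmath.exp(0.30j), true_cal)   # scan at t = 0 min
    g_b = baseline_gain(1.8 * cmath.exp(0.40j), true_cal)   # scan at t = 20 min
    g_mid = interpolate_gain(10.0, 0.0, g_a, 20.0, g_b)
    target_true = 0.5 * cmath.exp(-1.0j)
    target_obs = target_true * 0.9 * cmath.exp(0.35j)       # corrupted target
    corrected = calibrate(target_obs, g_mid)
    print(abs(corrected), cmath.phase(corrected))
```

Because the calibrator flux is assumed known here, dividing by the gain restores both the phase and the amplitude scale of the target visibility.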

In the approach outlined above, in order to calibrate the data one needs to solve for an 
unknown complex number per baseline, (i.e. N(N-1)/2 complex numbers for an N element 
interferometer). If we assume that the correlator itself does not produce any errors 18, i.e. 
that all the instrumental errors occur in the antennas or the transmission lines, then the 
instrumental gain can be written out as antenna based terms, i.e. 

$$G_{ij}(t) = g_i(t)\, g_j^*(t) \qquad (5.2.16)$$

where $g_i(t)$ and $g_j(t)$ are the complex gains along the signal paths from antennas $i$ and 
$j$. But the ionospheric phase can also be decomposed into antenna based quantities 
(see eqn 5.2.15), and can hence be lumped together with the instrumental phase. 
Consequently the total number of unknown complex gains that have to be solved for reduces from N(N-1)/2 
to N, which can be a dramatic reduction for large N. (For the GMRT it is a reduction from 
435 unknowns to 30 unknowns). 

15 by which we mean the phase difference over what would have been obtained in the absence of the ionosphere 

16 i.e. we have set the factor $B(l, m)/\sqrt{1 - l^2 - m^2}$ to 1. 

17 provided, as we will discuss in more detail later, that the system temperature does not differ for the target 
source and the calibrator 

18 which is often a good assumption for digital correlators 

However to appreciate the real power of this decomposition into antenna based gains, 
consider the following quantities. First let us look at the sum of the phases of the raw 
visibilities $\tilde V_{12}$, $\tilde V_{23}$ and $\tilde V_{31}$. If we call the true visibility phase $\psi_{V_{ij}}$, the raw visibility phase 
$\psi_{\tilde V_{ij}}$ and the sum of the instrumental and ionospheric phases of antenna $i$ $\chi_i$, then we have 

$$\begin{aligned}
\psi_{\tilde V_{12}} + \psi_{\tilde V_{23}} + \psi_{\tilde V_{31}} &= \chi_1 - \chi_2 + \psi_{V_{12}} + \chi_2 - \chi_3 + \psi_{V_{23}} + \chi_3 - \chi_1 + \psi_{V_{31}} \\
&= \psi_{V_{12}} + \psi_{V_{23}} + \psi_{V_{31}}
\end{aligned} \qquad (5.2.17)$$

i.e. over any triangle of baselines the sum of the phases of the raw visibilities is equal to the 
sum of the phases of the true source visibilities. This is called phase closure. Similarly it is 
easy to show that for any four antennas 1, 2, 3, 4, the following ratio of the raw visibility 
amplitudes is the same as that of the true visibility amplitudes, 

$$\frac{|\tilde V_{12}||\tilde V_{34}|}{|\tilde V_{23}||\tilde V_{41}|} = \frac{|V_{12}||V_{34}|}{|V_{23}||V_{41}|} \qquad (5.2.18)$$

This is called amplitude closure. For an N element interferometer, we have $\frac{1}{2}N(N-1) - (N-1)$ 
constraints on the phase and $\frac{1}{2}N(N-1) - N$ constraints on the amplitude. For 
large N, this is considerably more than the N unknown gains that one is solving for. The 
large number of available constraints means that the following iterative scheme would 
work. 

1. Choose a suitable starting model for the brightness distribution. Compute the model 
visibilities. 

2. For this model, solve for the antenna gains, subject to the closure constraints. 

3. Apply these gain corrections to the visibility data, use the corrected data to make a 
fresh model of the brightness distribution. 

For arrays with a sufficient number of antennas, convergence is usually rapid. Note 
however that, for this to work, the signal to noise ratio per visibility point 19 has to be reasonable, 
i.e. about 2-3. This is often the case at low frequencies, and this technique of determining 
antenna gains (which is called self calibration) is usually highly successful. 
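
The closure relations (5.2.17) and (5.2.18) on which this scheme rests are easy to verify numerically. The Python sketch below corrupts a set of made-up true visibilities with random antenna based complex gains and compares the closure quantities before and after. The function names, the random test values and the use of a dictionary keyed by antenna pairs are all assumptions made for illustration.

```python
import cmath
import random

def corrupt(true_vis, gains):
    """Apply antenna based gains: raw V_ij = g_i * conj(g_j) * true V_ij."""
    return {(i, j): gains[i] * gains[j].conjugate() * v
            for (i, j), v in true_vis.items()}

def closure_phase(vis, i, j, k):
    """Phase of the triple product V_ij V_jk V_ki, i.e. the sum of the three phases."""
    return cmath.phase(vis[(i, j)] * vis[(j, k)] * vis[(k, i)])

def closure_amplitude(vis, i, j, k, l):
    """|V_ij||V_kl| / (|V_jk||V_li|), as in eqn (5.2.18)."""
    return abs(vis[(i, j)]) * abs(vis[(k, l)]) / (abs(vis[(j, k)]) * abs(vis[(l, i)]))

if __name__ == "__main__":
    rng = random.Random(1)
    ants = [1, 2, 3, 4]
    # Arbitrary (hypothetical) true visibilities, stored for both antenna orders
    true_vis = {}
    for i in ants:
        for j in ants:
            if i < j:
                v = cmath.rect(rng.uniform(0.5, 2.0), rng.uniform(-3.0, 3.0))
                true_vis[(i, j)] = v
                true_vis[(j, i)] = v.conjugate()
    gains = {a: cmath.rect(rng.uniform(0.5, 1.5), rng.uniform(-3.0, 3.0)) for a in ants}
    raw_vis = corrupt(true_vis, gains)
    print(closure_phase(true_vis, 1, 2, 3), closure_phase(raw_vis, 1, 2, 3))
    print(closure_amplitude(true_vis, 1, 2, 3, 4), closure_amplitude(raw_vis, 1, 2, 3, 4))
```

The two numbers printed on each line agree to rounding error, however large the antenna gain errors are made.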

Note that if one adds a phase $\chi_i = 2\pi(l_0 u_i + m_0 v_i)$ to each antenna (where $l_0$, $m_0$ are 
arbitrary and $(u_i, v_i)$ are the (u,v) co-ordinates of the $i$th antenna), the phase closure 
constraints (eqn 5.2.17) continue to be satisfied. That means that in self calibration the 
phases can be solved only up to a constant phase gradient across the uv plane, i.e. the 
absolute source position is lost. Similarly, it is easy to see that the amplitude closure 
constraints will be satisfied even if one multiplies all the gains by a constant number, i.e. 
in self calibration one loses information on the absolute source flux density. The only 
way to determine the absolute source flux density is to look at a calibrator of known flux. 
Since antenna gains and system temperatures are usually stable over several hours 20, 
it is usually sufficient to do this calibration only once during an observing run. A more 
serious problem at low frequencies is that the Galactic background (whose strength varies 
with location on the sky) makes a significant contribution to the system temperature. 
Hence, when attempting to measure the source flux density, it is important to correct for 
the fact that the system temperature is different for the calibrator source as compared 
to the target source. The system temperature can typically be measured on rapid time 
scales by injecting a noise source of known strength at the front end amplifier. 

19 Actually strictly speaking one means the signal to noise ratio over an interval for which the ionospheric 
phase can be assumed to be constant 

Another way of solving for the system gains, related to self calibration, is the following. Suppose 
that the visibilities on baselines (i,j) and (k,l) are identical. Then the ratio of the measured 
visibilities is directly related to the ratio of the complex instrumental gains of antennas 
i, j, k & l. If there are enough such ‘redundant’ baselines, one could imagine 
solving for the instrumental gains. Some arrays, like the WSRT, have equispaced 
antennas, giving rise to a very large number of redundant baselines, and this technique has 
been successfully used to calibrate complex sources 21. For a simple source, like a point 
source, all possible baselines are redundant, and this technique reduces essentially to 
self-calibration. 

At the very lowest frequencies ($\nu < 200$ MHz, roughly, for the GMRT) the assumption 
that the source lies within the iso-planatic patch probably begins to break down. The 
simple self calibration scheme outlined above will stop working in that regime. A possible 
solution then, is to solve (roughly speaking) for the phase changes produced by each 
iso-planatic patch. Often the primary beams of several antennas will pass through the 
same iso-planatic patches, so the extra number of degrees of freedom introduced will 
not be substantial, and an iterative approach to solving for the unknowns will probably 
converge 22 . 


5.3 Further Reading 

1. Hamaker, J. P., O’Sullivan, J. D. & Noordam, J. E., Journal of the Optical Society of 
America, 67, 1122. 

2. Thompson, A. R., Moran, J. M. & Swenson, G. W. Jr., ‘Interferometry & Synthesis in 
Radio Astronomy’, Wiley Interscience. 

3. Perley, R. A., Schwab, F. R. & Bridle, A. H., eds., ‘Synthesis Imaging in Radio 
Astronomy’ 


20 Or change in a predictable manner with changing azimuth and elevation of the antennas 
21 see Noordam, J. E. & de Bruyn A. G., 1982, Nature 299 , 597. 

22 See Subrahmanya, C. R., (in ‘Radio Astronomical Seeing’, J. E. Baldwin & Wang Shouguan eds.) for more 
details 



Chapter 6 

Phased Arrays 

Yashwant Gupta 


6.1 Introduction 

A single element telescope with a steerable paraboloidal reflecting surface is the 
simplest kind of radio telescope that is commonly used. Such a telescope gives an angular 
resolution $\sim \lambda/D$, where $D$ is the diameter of the aperture and $\lambda$ is the wavelength of 
observation. For example, for a radio telescope of 100 m diameter (which is about the 
largest that is practically feasible for a mechanically steerable telescope), operating at a 
wavelength of 1 m, the resolution is ~ 30 arc min. This is a rather coarse resolution and 
is much poorer than the resolution of ground based optical telescopes. 

Use of antenna arrays is one way of increasing the effective resolution and collecting 
area of a radio telescope. An array usually consists of several discrete antenna elements 
arranged in a particular configuration. Most often this configuration produces an 
unfilled aperture antenna, where only part of the overall aperture is filled by the antenna 
structure. The array elements can range in complexity from simple, fixed dipoles to fully 
steerable, parabolic reflector antennas. The outputs (voltage signals) from the array 
elements can be combined in various ways to achieve different results. For example, the 
outputs may be combined, with appropriate phase shifts, to obtain a single, total power 
signal from the array - such an array is generally referred to as a phased array. If the 
outputs are multiplied in distinct pairs in a correlator and processed further to make an 
image of the sky brightness distribution, the array is generally referred to as a correlator 
array (or an interferometer). Here we will primarily be concerned with the study of phased 
arrays, with direct comparison of the performance with correlator arrays, where relevant. 


6.2 Array Theory 

6.2.1 The 2 Element Array 

We begin by deriving the far field radiation pattern for the case of the simplest array, two 
isotropic point source elements separated by a distance $d$, as shown in Figure 6.1. The 
net far field in the direction $\theta$ is given as 

$$E(\theta) = E_1 e^{j\psi/2} + E_2 e^{-j\psi/2} , \qquad (6.2.1)$$

where $\psi = kd\sin\theta + \delta$, $k = 2\pi/\lambda$ is the wavenumber and $\delta$ is the intrinsic phase difference 
between the two sources. $E_1$ and $E_2$ are the amplitudes of the electric field due to the 
two sources, at the distant point under consideration. The reference point for the phase, 
referred to as the phase centre, is taken halfway between the two elements. If the two 
sources have equal strength, $E_1 = E_2 = E_0$ and we get 

$$E(\theta) = 2E_0 \cos(\psi/2) \qquad (6.2.2)$$

Figure 6.1: Geometry for the 2 element array. 

The power pattern is obtained by squaring the field pattern. By virtue of the reciprocity 
theorem 1, $E(\theta)$ also represents the voltage reception pattern obtained when the signals 
from the two antenna elements are added, after introducing the phase shift $\delta$ between 
them. 

For the case of $\delta = 0$ and $d \gg \lambda$, the field pattern of this array shows sinusoidal 
oscillations for small variations of $\theta$ around zero, with a period of $2\lambda/d$. Non-zero values 
of $\delta$ simply shift the phase of these oscillations by the appropriate value. 
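
The behaviour of eqn (6.2.2) near $\theta = 0$ can be checked with a short Python sketch. The function name and the spacing of 100 wavelengths are illustrative assumptions; the sketch simply evaluates the field pattern at a few fractions of the expected period $2\lambda/d$.

```python
import math

def two_element_field(theta, d_over_lambda, delta=0.0, e0=1.0):
    """Field pattern 2*E0*cos(psi/2) of eqn (6.2.2), with psi = k d sin(theta) + delta."""
    psi = 2.0 * math.pi * d_over_lambda * math.sin(theta) + delta
    return 2.0 * e0 * math.cos(psi / 2.0)

if __name__ == "__main__":
    d_over_lambda = 100.0            # hypothetical spacing, d >> lambda
    period = 2.0 / d_over_lambda     # expected period 2*lambda/d, in radians
    for frac in (0.0, 0.25, 0.5, 1.0):
        theta = frac * period
        print(frac, two_element_field(theta, d_over_lambda))
```

The field goes from $+2E_0$ through zero to $-2E_0$ and back over one period. The power pattern, being the square of the field, repeats twice as fast, i.e. every $\lambda/d$.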

If the individual elements are not isotropic but have identical directional patterns, the 
result of eqn 6.2.2 is modified by replacing $E_0$ with the element pattern, $E_i(\theta)$. The 
final pattern is given by the product of this element pattern with the $\cos(\psi/2)$ term which 
represents the array pattern. This brings us to the important principle of pattern 
multiplication which can be stated as: The total field pattern of an array of nonisotropic 
but similar elements is the product of the individual element pattern and the pattern of 
an array of isotropic point sources each located at the phase centre of the individual 
elements and having the same relative amplitude and phase, while the total phase pattern is 
the sum of the phase patterns of the individual elements and the array of isotropic point 
sources. This principle is used extensively in deriving the field pattern for complicated 
array configurations, as well as for designing array configurations to meet specified field 
pattern requirements (see the book on “Antennas” by J.D. Kraus (1988) for more details). 

6.2.2 Linear Arrays of n Elements of Equal Amplitude and Spacing 

We now consider the case of a uniform linear array of n elements of equal amplitude, as 
shown in Figure 6.2. Taking the first element as the phase reference, the far field pattern 
is given by 

$$E(\theta) = E_0 \left[ 1 + e^{j\psi} + e^{j2\psi} + \ldots + e^{j(n-1)\psi} \right] , \qquad (6.2.3)$$

where $\psi = kd\sin\theta + \delta$, $k = 2\pi/\lambda$ is the wavenumber and $\delta$ is the progressive phase 
difference between the sources. The sum of this geometric series is easily found to be 

$$E(\theta) = E_0\, \frac{\sin(n\psi/2)}{\sin(\psi/2)}\, e^{j(n-1)\psi/2} \qquad (6.2.4)$$

1 see Chapter 3 

If the centre of the array is chosen as the phase reference point, then the above result 
does not contain the phase term of $(n-1)\psi/2$. For nonisotropic but similar elements, $E_0$ 
is replaced by the element pattern, $E_i(\theta)$, to obtain the total field pattern. 
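
That the closed form (6.2.4) really is the sum of the series (6.2.3) can be confirmed numerically. In the Python sketch below the function names are assumptions made for illustration, and the special handling of $\psi$ equal to a multiple of $2\pi$ (where the sine ratio is 0/0) uses the limiting value $\pm n$.

```python
import cmath
import math

def array_field_sum(psi, n, e0=1.0):
    """Direct sum of eqn (6.2.3): E0 * sum_{m=0}^{n-1} exp(j m psi)."""
    return e0 * sum(cmath.exp(1j * m * psi) for m in range(n))

def array_field_closed(psi, n, e0=1.0):
    """Closed form of eqn (6.2.4), including the phase term exp(j (n-1) psi / 2)."""
    half = psi / 2.0
    if abs(math.sin(half)) < 1e-12:
        # psi is a multiple of 2*pi: the sine ratio tends to +n or -n
        k = round(psi / (2.0 * math.pi))
        ratio = n * (-1.0) ** (k * (n - 1))
    else:
        ratio = math.sin(n * half) / math.sin(half)
    return e0 * ratio * cmath.exp(1j * (n - 1) * half)

if __name__ == "__main__":
    n = 8
    for psi in (0.0, 0.3, 2.0 * math.pi / n, 1.7):
        print(psi, abs(array_field_sum(psi, n)), abs(array_field_closed(psi, n)))
```

The magnitude equals $nE_0$ at $\psi = 0$ and vanishes at $\psi = 2\pi/n$, which is the first null of the pattern.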

The field pattern in eqn 6.2.4 has a maximum value of $nE_0$ when $\psi = 0, 2\pi, 4\pi, \ldots$. The 
maximum at $\psi = 0$ is called the main lobe, while the other maxima are called grating lobes. 
For $d < \lambda$, only the main lobe maximum maps to the physically allowed range of $0 \le \theta \le 2\pi$. 
By suitable choice of the value of $\delta$, this maximum can be “steered” to different values of $\theta$, 
using the relation $kd\sin\theta = -\delta$. For example, when all the elements of the array are in 
phase ($\delta = 0$), the maximum occurs at $\theta = 0$. This is referred to as a “broadside” array. 
For a maximum along the axis of the array ($\theta = 90°$), $\delta = -kd$ is required, giving rise to an 
“end-fire” array. The broadside array produces a disc or fan shaped beam that covers a 
full 360° in the plane normal to the axis of the array. The end-fire array produces a cigar 
shaped beam which has the same shape in all planes containing the axis of the array. 
For nonisotropic elements, the element pattern also needs to be steered (electrically or 
mechanically) to match the direction of its peak response with that of the peak of the 
array pattern, in order to achieve the maximum peak of the total pattern. 

For the case of $d > \lambda$, the grating lobes are uniformly spaced in $\sin\theta$ with an 
interval between adjacent lobe maxima of $\lambda/d$, which translates to $\ge \lambda/d$ on the $\theta$ axis (see 
Figure 6.3). 

Figure 6.2: Geometry for the n element array 

Figure 6.3: Grating lobes for an array of n identical elements. The solid line is the array 
pattern. The broad, dashed line curve is an example of the element pattern. The resultant 
of these two is shown as the dotted pattern. 

The uniform, linear array has nulls in the radiation pattern which are given by the 
condition $\psi = \pm 2\pi l/n$, $l = 1, 2, 3, \ldots$ (excluding multiples of n, which give the maxima), 
which yields 

$$\theta = \sin^{-1}\left[ \frac{1}{kd} \left( \pm\frac{2\pi l}{n} - \delta \right) \right] \qquad (6.2.5)$$

For a broadside array ($\delta = 0$), these null angles are given by 

$$\theta = \sin^{-1}\left( \pm\frac{2\pi l}{nkd} \right) \qquad (6.2.6)$$

Further, if the array is long ($nd \gg l\lambda$), we get 

$$\theta \simeq \pm\frac{l\lambda}{nd} \simeq \pm\frac{l}{L_\lambda} \qquad (6.2.7)$$

where $L_\lambda$ is the length of the array in wavelengths and $L_\lambda = (n-1)d/\lambda \simeq nd/\lambda$ for large n. 
The first nulls occur at $l = 1$, and the beam width between first nulls (BWFN) for such 
an array is given by 

$$\mathrm{BWFN} = \frac{2}{L_\lambda}\ \mathrm{rad} = \frac{114.6}{L_\lambda}\ \mathrm{deg} . \qquad (6.2.8)$$

The half-power beam width (HPBW) is then given by 

$$\mathrm{HPBW} \simeq \frac{\mathrm{BWFN}}{2} = \frac{57.3}{L_\lambda}\ \mathrm{deg} \qquad (6.2.9)$$

Similarly, it can be shown that the HPBW of an end-fire array is $\sqrt{2/L_\lambda}$ (see “Antennas” 
by J.D. Kraus (1988) for more details). 
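
The null positions and beam widths of a broadside array follow directly from eqns (6.2.6), (6.2.8) and (6.2.9). The Python sketch below is a minimal illustration; the function names and the example array (100 elements spaced one wavelength apart) are assumptions, not values from the text.

```python
import math

def null_angles_deg(n, d_over_lambda, l_max):
    """Null directions of a broadside array from eqn (6.2.6), in degrees."""
    angles = []
    for l in range(1, l_max + 1):
        if l % n == 0:
            continue  # psi is a multiple of 2*pi: a maximum (grating lobe), not a null
        s = l / (n * d_over_lambda)        # 2*pi*l / (n k d) = l*lambda / (n d)
        if s <= 1.0:
            angles.append(math.degrees(math.asin(s)))
    return angles

def bwfn_deg(length_in_wavelengths):
    """Beam width between first nulls of a long broadside array, eqn (6.2.8)."""
    return math.degrees(2.0 / length_in_wavelengths)

def hpbw_deg(length_in_wavelengths):
    """Half-power beam width, taken as BWFN / 2 as in eqn (6.2.9)."""
    return bwfn_deg(length_in_wavelengths) / 2.0

if __name__ == "__main__":
    n, d_over_lambda = 100, 1.0            # hypothetical array
    first_null = null_angles_deg(n, d_over_lambda, 1)[0]
    print(2.0 * first_null, bwfn_deg(n * d_over_lambda), hpbw_deg(n * d_over_lambda))
```

For a long array, twice the first null angle agrees closely with the BWFN obtained from the small angle formula.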

Such linear arrays are useful for studying sources of size $< \lambda/d$ radians, as only one 
lobe of the pattern can respond to the source at a given time. Also, the source should be 
strong enough so that confusion due to other sources in the grating lobes is not 
significant. Linear grating arrays are particularly useful for studying strong isolated sources 
such as the Sun. 

The presence of grating lobes (with amplitude equal to the main lobe) in the response 
of an array is usually an unwanted feature, and it is desirable to reduce their levels as 
much as possible. For non-isotropic elements, the taper in the element pattern provides 
a natural reduction of the amplitude of the higher grating lobes. This is illustrated in 
Figure 6.3. To get complete cancellation of all the grating lobes, starting with the first 
one, requires an element pattern that has periodic nulls spaced $\lambda/d$ apart, with the first 
null falling at the location of the first grating lobe. This requires the elements to have 
an aperture of $\sim d$, which makes the array equivalent to a continuous or filled aperture 
telescope. This can be seen mathematically by replacing $E_0$ in eqn 6.2.4 by the element 
pattern of an antenna of aperture size $d$ and showing that it reduces to the expression for 
the field pattern of a continuous aperture of size $nd$. 

The theoretical treatment given above is easily extended to two dimensional antenna 
arrays. 


6.2.3 The Fourier Transform Approach to Array Patterns 

So far we have obtained the field pattern of an array by directly adding the electric field 
contributions from different elements. Now, it is well established that for a given 
aperture, if the electric field distribution across the aperture is known, then the radiation 
pattern can be obtained from a Fourier Transform of this distribution (see, for example, 
Christiansen & Hogbom 1985). This principle can also be used for computing the field 
pattern of an array. Consider the case of the array pattern for the 2-element array 
discussed earlier, as an example. The electric field distribution across the aperture can be 
taken to be zero at all points except at the location of the two elements, where it is a delta 
function for isotropic point sources. The Fourier Transform of this gives the sinusoidal 
oscillations in $\sin\theta$, which have also been inferred from eqn 6.2.2. 

Using the Fourier Transform makes it easy to understand the principle of pattern 
multiplication described above. When the isotropic array elements are replaced with 
directional elements, it corresponds to convolving their delta function electric field distribution 
with the electric field distribution across the finite apertures of these directional elements. 
Since convolution of two functions maps to multiplication of their Fourier Transforms in 
the transform domain, the total field pattern of the array is naturally the product of the 
field pattern of the array with isotropic elements with the field pattern of a single element. 
The computational advantages of the Fourier Transform make this approach the natural 
way to obtain the array pattern of two dimensional array telescopes having a complicated 
distribution of elements. 


6.3 Techniques for Phasing an Array 

The basic requirement for phasing an array is to combine the signals from the elements 
with proper delay and phase adjustments so that the beam can be pointed or steered in 
the chosen direction. Some of the earliest methods employed techniques for mechanically 
switching in different lengths of cables between each element and the summing point, to 
introduce the delays required to phase the array for different directions. The job became 
somewhat less cumbersome with the use of electronic switches, such as PIN diodes. 
However, the complexity of the cabling and switching network increases enormously with 
the increase in number of elements and the number of directions for which phasing is 
required. 

Another method of phasing involves the use of phase shifters at each element of the 
array. For example, this can be achieved by using ferrite devices or by switching in 
incremental lengths of cable (or microstrip delay lines), using electronic switches. The phase 
increments are usually implemented in binary steps (for example $\lambda/2$, $\lambda/4$, $\lambda/8$, ...). In 
this scheme, the value of the smallest incremental phase difference controls the accuracy 
of the phasing that can be achieved. 
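
The effect of binary phase steps on phasing accuracy can be illustrated with a short Python sketch. It assumes a hypothetical shifter with `n_bits` switchable sections corresponding to path lengths $\lambda/2$, $\lambda/4$, ... (phases $\pi$, $\pi/2$, ...); the function name and the rounding to the nearest available setting are choices made for this example only.

```python
import math

def quantise_phase(phase, n_bits):
    """Nearest phase available from binary steps pi, pi/2, ..., 2*pi/2**n_bits.

    Returns (bit settings, realised phase, residual error), with the bits
    ordered from the largest step (lambda/2) to the smallest.
    """
    levels = 2 ** n_bits
    step = 2.0 * math.pi / levels
    code = int(round((phase % (2.0 * math.pi)) / step)) % levels
    bits = [(code >> (n_bits - 1 - b)) & 1 for b in range(n_bits)]
    realised = code * step
    # wrap the error into the range [-pi, pi)
    error = (phase - realised + math.pi) % (2.0 * math.pi) - math.pi
    return bits, realised, error

if __name__ == "__main__":
    # Phase needed at each of 6 elements for a hypothetical progressive phase of 0.7 rad
    for m in range(6):
        bits, realised, error = quantise_phase(0.7 * m, 3)
        print(m, bits, round(realised, 4), round(error, 4))
```

The residual error never exceeds half of the smallest step, so each extra bit halves the worst case phasing error.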

In most modern radio telescopes, digital electronic techniques are used for processing 
the signals. The output from an antenna is usually down-converted to a baseband 
frequency in a heterodyne receiver after which it is Nyquist sampled for further processing. 
Techniques for introducing delays and phase changes in the signal in the digital domain, 
using computers or special purpose hardware, are fairly easy to implement and flexible. 

The description of phasing techniques given above applies when the delay 
compensation of the signals from the different elements of the array is carried out at the radio 
frequency of observation. When this delay compensation is carried out at the 
intermediate or baseband frequency of the heterodyne receiver, the signals pick up an extra phase 
term of $2\pi\nu_{LO}\tau_g$, where $\nu_{LO}$ is the local oscillator frequency used for the down conversion 
and $\tau_g$ is the delay (with respect to the phase centre of the array) suffered by the signal from the element 
(see for example Thompson, Moran & Swenson, 1986). To obtain the optimum phased 
array signal, these phase terms have to be compensated before the signals from array 
elements with different values of $\tau_g$ are added. Furthermore, $\tau_g$ for an array element varies 
with time for observations of a given source and this also needs to be compensated. 

For an array with similar elements, the amplitude of the signals from the elements is 
usually kept constant at a common value, while the phase is varied to phase the array. 
However, in the most general case, the amplitude of the signals from different elements 
can be adjusted to enhance some features of the array response. This is most often used 
to reduce the sidelobe levels of the telescope or shift the nulls of the array pattern to 
desired locations, such as directions from which unwanted interference signals may be 
coming. Arrays where such adjustments are easily and dynamically possible are called 
adaptive beam-forming arrays, and are discussed further in Chapter 7. 


6.4 Coherently vs Incoherently Phased Array 

Normally, the signals from an n-element phased array are combined by adding the 
voltage signals from the different antennas after proper delay and phase compensation. This 
summed voltage signal is then put through a square-law detector and an output 
proportional to the power in the summed signal is obtained. For identical elements, this 
phased array gives a sensitivity which is n times the sensitivity of a single element, for 
point source observations. The beam of such a phased array is much narrower than that 
of the individual elements, as it is the process of adding the voltage signals with different 
phases from the different elements that produces the narrow beam of the array pattern. 
For some special applications, it is useful to first put the voltage signal from each element 
of the array through a square-law detector and then add the powers from the elements 
to get the final output of the array. This corresponds to an incoherent addition of the 
signals from the array elements, whereas the first method gives a coherent addition. In 
the incoherent phased array operation, the beam of the resultant telescope has the same 
shape as that of a single element, since the phases of the voltages from individual 
elements are lost in the detection process. This beam width is usually much larger than 
the beam width of the coherent phased array telescope. The sensitivity to a point source 
is higher for the coherent phased array telescope as compared to the incoherent phased 
array telescope, by a factor of $\sqrt{n}$. 

The incoherent phased array mode of operation is useful for two kinds of astronomical
observations. The first is when the source is extended in size and covers a large fraction of
the beam of the element pattern. In this case, the incoherent phased array observation
gives a better sensitivity. The second case is when a large region of the sky has to be
covered in a survey mode (for example, in a survey of the sky in search of new pulsars).
Here, the time taken to cover the same area of sky to an equal sensitivity level is less for
the incoherent phased array mode. Only for a filled aperture phased array telescope are
these times the same. For a sparsely filled physical aperture such as an earth rotation
aperture synthesis telescope, this distinction between the coherent and incoherent phased
array modes is an important aspect of phased array operation.
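The survey-time statement can be made plausible with a rough scaling argument. The sketch below is an illustration under stated assumptions rather than a derivation from the lecture: n identical elements of diameter d spread over an array of extent D, beam solid angles approximated as (λ/size)², and integration time to a fixed flux limit scaling as the inverse square of the sensitivity gain (n for the coherent mode, √n for the incoherent mode). The symbols d, D, Ω and f are introduced here for the purpose.

```latex
% Assumed symbols: n elements of diameter d, array extent D, wavelength \lambda,
% survey solid angle \Omega, filling factor f.
\begin{align*}
N_{\rm coh} &\approx \frac{\Omega}{(\lambda/D)^{2}}, \qquad
N_{\rm inc} \approx \frac{\Omega}{(\lambda/d)^{2}}
  \qquad \text{(number of pointings)} \\
t_{\rm coh} &\propto \frac{1}{n^{2}}, \qquad
t_{\rm inc} \propto \frac{1}{n}
  \qquad \text{(time per pointing to a fixed flux limit)} \\
\frac{T_{\rm coh}}{T_{\rm inc}}
  &= \frac{N_{\rm coh}\, t_{\rm coh}}{N_{\rm inc}\, t_{\rm inc}}
   = \frac{D^{2}}{n\, d^{2}}
   = \frac{1}{f}, \qquad f \equiv \frac{n\, d^{2}}{D^{2}} \le 1 .
\end{align*}
```

Under these assumptions the two modes take equal time only when the filling factor f is unity, and the incoherent mode is faster by a factor 1/f for a sparse array.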


6.5 Comparison of Phased Array with a Multi-Element Interferometer

As has been mentioned in Section 1, the basic distinction between a phased array and a 
multi-element interferometer is that in a phased array the signals from all the elements 
are added in phase before (or after) being put through a square-law detector, whereas
in a multi-element interferometer, the signals from the elements are correlated in pairs 
for each possible combination of two elements and these outputs are further processed 
to make a map of the brightness distribution. Thus, if the signal from element i is given 
by V_i, the output of the (coherent) phased array can be written as

$V_{PA} = \left\langle \left( \sum_{i=1}^{n} V_i \right)^{2} \right\rangle$ (6.5.10)

whereas the interferometer output is given by

$V_{ij} = \langle V_i V_j \rangle, \quad i,j = 1,2,\ldots,n; \; i \neq j$ (6.5.11)

Expansion of the right hand side of eqn 6.5.10 produces terms of the kind ⟨V_i V_j⟩
and ⟨V_i²⟩. The first kind are all available from the correlator outputs and, if the correlator
also records the self products of all the elements, the second kind are also provided 
by the correlator. Thus, by appropriate combinations of the outputs of the correlator 
used in the multi-element interferometer, the phased array output can be synthesised. 
Even the steering of the beam of the phased array can be achieved by combining the 
visibilities from the correlator after multiplying with appropriate phase factors. Also, the 
incoherently phased array output can be synthesised by combining only the self product 
outputs from the correlator. 
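The synthesis of the phased array outputs from correlator products can be sketched as follows. This is an illustrative example, assuming real-valued voltage samples that are already delay and phase compensated; the function names and the simulated data are not from the original text.

```python
import random


def correlator_outputs(voltages):
    """Time-averaged products <V_i V_j> for all i, j (self products on the diagonal).

    voltages[i] is the list of real voltage samples from element i.
    """
    n = len(voltages)
    n_samp = len(voltages[0])
    return [[sum(a * b for a, b in zip(voltages[i], voltages[j])) / n_samp
             for j in range(n)] for i in range(n)]


def phased_array_direct(voltages):
    """Coherent phased array: add the voltages, square-law detect, average."""
    summed = [sum(sample) for sample in zip(*voltages)]
    return sum(v * v for v in summed) / len(summed)


def phased_array_from_correlator(corr):
    """Coherent output: sum of all cross products plus all self products."""
    n = len(corr)
    return sum(corr[i][j] for i in range(n) for j in range(n))


def incoherent_from_correlator(corr):
    """Incoherent output: sum of the self products only."""
    return sum(corr[i][i] for i in range(len(corr)))


rng = random.Random(0)
common = [rng.gauss(0, 1) for _ in range(2000)]            # signal seen by every element
volts = [[s + rng.gauss(0, 1) for s in common] for _ in range(4)]
corr = correlator_outputs(volts)
```

Here `phased_array_from_correlator(corr)` agrees with `phased_array_direct(volts)` to rounding error, while the correlator had to compute n² products per sample where the phased array needed one sum and one square.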

However, the network of multipliers required to implement the correlator is much
more complicated hardware than the adder and square-law detector needed for the
phased array. Further, the net data rate out of the correlator is much higher than that
from the phased array output, for data with the same time resolution. Thus, the interferometer
achieves the phased array response in a very expensive manner. This is especially
true for very compact, point-like sources where observations with an interferometer do
not provide any extra information about the nature of the source. For example, observations
of pulsars are best suited to a phased array, as these are virtually point sources for
the interferometer and the requirement for high time resolution that is relevant for their
studies is more easily met with a phased array output.


6.6 Further Reading 

1. Kraus, J.D. “Radio Astronomy”, Cygnus-Quasar Books, Ohio, USA, 1986 

2. Kraus, J.D. “Antennas”, McGraw-Hill Book Company, New York, USA, 1988 

3. Thompson, A.R., Moran, J.M. & Swenson, G.W. “Interferometry and Synthesis in 
Radio Astronomy”, John Wiley & Sons, New York, USA, 1986 

4. Christiansen, W.N. & Högbom, J.A. “Radio Telescopes”, Cambridge University Press,
Cambridge, UK, 1985

Chapter 7

Imaging With Dipolar Arrays 

N. Udaya Shankar 


In this lecture we will discuss the radio telescopes in which a beamforming network 
is used to combine signals from the antenna elements and may also provide the required 
aperture distribution for beam shaping and side lobe control. 


7.1 Early History of Dipole Arrays 

Radiotelescopes with a variety of antennas of different forms have been built to suit the
large range of wavelengths over which radio observations are made 1. Quasi-optical antennas
such as parabolic reflectors are considered more appropriate for millimeter and
centimeter wavelengths. At the other end of the radio spectrum, multi-element arrays of
dipole antennas have been preferred for meter and decameter wavelengths.

Early observations in radio astronomy were made using one of two methods: either
pencil beam aerials of somewhat lower resolution to investigate the distribution of radio
emission over the sky, or interferometers to observe bright sources of small angular size.
However, the observations made during the early 1950s showed that to determine the real
nature of the radio brightness distribution it is necessary to construct pencil beam radio
telescopes having beam widths of the same order as the separation between the lobes
of the interferometers then in use (~1'). An important step towards such modern
high-resolution radiotelescopes was the realisation that in many cases even unfilled apertures
which contain all the relative positions of a filled aperture (“skeleton telescopes”) can be
used to measure the brightness distribution. The cross-type radio telescope, pioneered by
Mills, was the first to demonstrate the principle of skeleton telescopes.

A cross consists of two long and relatively narrow arrays arranged as a symmetrical
cross, usually in the N-S and the E-W directions, intersecting at right angles at their
centers (Figure 7.1). Each array has a fan beam response, narrow along its length and
wide in the perpendicular direction 2. The outputs from both the arrays are amplified and
multiplied together; only sources of radiation that lie within the cross hatched portion
of Figure 7.1(b) produce a coherent signal. Thus an effective pencil beam is produced, of
angular size determined solely by the length of the two arrays. A substantial number of
telescopes were constructed based on this principle.

1 See the illustrations in Chapter 3

2 See Section 6.2.2




Figure 7.1: A cross type telescope. The arrays in Panel (a) produce the fan beams shown 
in Panel (b). When the outputs of these two arrays are multiplied together, only signals 
originating from the cross hatched region common to both beams produce a coherent 
output. The resolution of such a telescope hence depends only on the lengths of the 
arms. 

The Sydney University telescope was constructed as a cross with aerials of overall
dimensions approximately 1 mile long and 40 ft wide (Mills et al. 1963). The mile-long
reflectors are in the form of cylindrical parabolas, with a surface of wire mesh. Line feeds
for two operating frequencies of 408 MHz and 111.5 MHz were provided at their foci. The
N-S arm employs a fixed reflector pointing vertically upwards and the beam is directed
in the meridian plane by phasing the dipoles of the feed. The E-W arm is tiltable
about its long axis to direct the beam, also in the meridian plane, to intersect the N-S
response pattern. No phasing was employed in this aerial. The angular coverage was
55° on either side of the zenith. The E-W aperture is divided into two separate halves
through which the continuous N-S arm passes. The total collecting area is 400,000 sq. ft.
This instrument had a resolution of approximately 2'.8 at 408 MHz. It later came to be
known as the “Mills Cross” and is one of the earliest cross type radio telescopes built. In
order to reduce cost, this telescope was built as a meridian transit instrument.

Note that in a cross antenna, one quarter of the antenna provides redundant information,
since all element spacings of a filled aperture are still present if half of one array
is removed. In fact, it can be shown that the cosine response of a T array is similar to
that of a full cross. Thus a survey carried out using a T array has the same resolution
as that of a survey carried out using a cross. However it has a collecting area √2 times
lower than the corresponding cross and hence a lower sensitivity.


7.2 Image Formation 

An array can be considered as a sampled aperture. When an array is illuminated by a 
source, samples of the source’s wavefront are recorded at the location of the antenna 
elements. The outputs from the elements can be subjected to various forms of signal 
processing, wherein phase and amplitude adjustments are made to produce the desired
outputs. If the voltages from elemental antennas are simply added (as in the phased
arrays discussed in Chapter 6), the energy received from a large portion of the sky will
be rejected. When the array is illuminated by a point source this gives the beam of the
array, which is the Fourier transform of the aperture current distribution. A single beam
instrument can use only a part of the total available time to observe each beam width
of the sky. One can generate multiple independent beams in the sky by amplifying the
signals from each element separately and combining them with different phase shifts. Such
a multiple-beam or image forming instrument can observe different directions in the sky
simultaneously.

A simple linear array, which generates a single beam, can be converted to a multiple
beam antenna by attaching phase shifters to the output of each element. Each beam to be
formed requires one additional phase shifter per element. Thus an N element array needs
N² phase shifters to form N beams. Since the formation of a beam amounts to Fourier
transforming the aperture distribution, this requirement of N² phase shifters is very similar
to the requirement of N² multipliers for an N point Fourier transform. Such a network is
known as a Blass network (Figure 7.2). Analogous to the fast Fourier transform, there is also
a Butler beam-forming matrix, which needs only N log N elements for beam forming. The
Butler matrix uses 90° phase-lag hybrid junctions with 45° fixed-phase shifters. Blass and
Butler networks for a four-element array are shown in Figure 7.2. If the elemental
spacing is λ/2, the Butler matrix produces four beams. Although these beams overlap,
they are mutually orthogonal. Surprisingly, the Butler matrix was developed before the
development of the FFT.
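The orthogonality of such a set of beams can be illustrated with a brute-force discrete Fourier transform beamformer. This is a sketch under assumptions, not a model of the actual Blass or Butler hardware: it uses a plain DFT with beams at integer steps in sin θ, whereas a hardware Butler matrix usually places its beams symmetrically about broadside. The function names are illustrative.

```python
import cmath
import math


def beam_weights(n_elements, beam):
    """Phase weights of DFT beam number `beam` for elements spaced lambda/2 apart."""
    return [cmath.exp(-2j * math.pi * beam * k / n_elements) for k in range(n_elements)]


def beam_direction(n_elements, beam):
    """sin(theta) towards which the given beam points, for lambda/2 spacing."""
    m = beam if beam < n_elements / 2 else beam - n_elements
    return 2.0 * m / n_elements


def plane_wave(n_elements, sin_theta):
    """Element voltages for a unit plane wave; the phase step is pi*sin(theta) at lambda/2."""
    return [cmath.exp(1j * math.pi * sin_theta * k) for k in range(n_elements)]


def form_beams(voltages):
    """All N beams by brute force: N*N complex multiplications per output set."""
    n = len(voltages)
    return [sum(w * v for w, v in zip(beam_weights(n, b), voltages)) / n
            for b in range(n)]


# A source in the direction of beam 1 of a four-element array (sin(theta) = 0.5)
outputs = form_beams(plane_wave(4, beam_direction(4, 1)))
```

The source appears with unit amplitude in beam 1 and with zero amplitude in the other three beams, which is what orthogonality means in practice. For N = 4 the brute-force network needs N² = 16 phase shifts, against N log₂ N = 8 for an FFT-like arrangement.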

There are a number of drawbacks with multiple-beam formers, viz. 

1. It is difficult to reconfigure the beam former. Most multiple beam formers can only 
produce fixed beams. 

2. The separation between the multiple beams cannot be any less than that for orthogonal
beams.

3. As the number of beams is increased, one has to keep track of the signal to noise
ratio (SNR) of the individual beams.

4. As the array length becomes longer and the total span of the multiple beams increases,
the difference between the arrival times of the wave-front from the source
to the ends of the array becomes comparable to the inverse of the bandwidth of the
signal used and the loss of SNR due to bandwidth effects becomes large.
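The bandwidth effect in item 4 can be estimated by comparing the geometric delay across the array with the inverse bandwidth. The following sketch uses illustrative numbers (a 1.4 km array, a beam 30° from the zenith and a 1 MHz bandwidth) chosen only for the example; the function names are not from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def delay_across_array(length_m, zenith_angle_deg):
    """Difference in wavefront arrival time between the two ends of the array, in seconds."""
    return length_m * math.sin(math.radians(zenith_angle_deg)) / C


def delay_bandwidth_product(length_m, zenith_angle_deg, bandwidth_hz):
    """Delay in units of 1/bandwidth; values approaching 1 indicate serious loss of SNR."""
    return delay_across_array(length_m, zenith_angle_deg) * bandwidth_hz


product = delay_bandwidth_product(1400.0, 30.0, 1.0e6)
```

With these assumed numbers the delay is about 2.3 μs, more than twice the inverse bandwidth, so a beam formed with phase shifts alone would lose much of its signal to noise ratio and true delay compensation would be needed.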


7.3 Digital Beam Forming 

Digital Beam Forming (DBF) is a marriage between antenna technology and digital
technology. Workers in Sonar and Radar systems first developed the early ideas of digital
beam forming. This, coupled with the development of aperture synthesis techniques in
radio astronomy, led to the development of the modern dipolar arrays.

An antenna can be considered to be a device that converts spatio-temporal signals
into strictly temporal signals, thereby making them available to a wide variety of signal
processing techniques. From a conceptual point of view, its sampled outputs represent
all of the data arriving at the antenna aperture. No information is destroyed, at least not
until the processing begins, and any compromises that are made in the processing stages
can be noted and estimates made of the divergence of the actual system from the ideal.



Figure 7.2: A Blass beam forming network (Panel (a)). Such a network requires N²
phase shifters to form N beams from N antennas. On the other hand, the Butler beam
forming network (Panels (b) and (c)) requires only N log(N) phase shifters to achieve the
same result.


Digital beam forming is based on the conversion of the RF signal at each antenna
element into two streams of binary baseband signals representing cos and sin channels 3.
These two digital baseband signals can be used to recover both the amplitudes
and phases of the signals received at each element of the array. The process of digital
beamforming implies weighting these signals by a complex weighting function and then
adding them together to form the desired output. The key to this technology is the accurate
translation of the analog signal into the digital regime. Close matching of several receivers
is not achieved in hardware, but rather by applying a calibration process. It is expected
that more and more of the receiver functions will be implemented using software. Eventually
one would expect that the receiver would be built using software rather than hardware.
We shall get back to this aspect later.

Figure 7.3 depicts a simple structure that can be used for beamforming. The process
represented in Figure 7.3(a) is referred to as element-space beamforming, where the
data signals from the array elements are directly multiplied by a set of weights to form
the desired beam. Rather than directly weighting the outputs from the array elements,
they can be first processed by a multiple-beam beamformer to form a suite of orthogonal
beams. The output of each beam can then be weighted and the results combined to produce
a desired output. This process is often referred to as beam-space beamforming
(Figure 7.3(b)).

3 See Section 4.4

Figure 7.3: Digital beam forming networks. Panel (a) shows an element space beam 
former while Panel (b) shows a beam space digital beam former. 
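The relation between the two schemes of Figure 7.3 can be sketched in a few lines. The example below assumes that the multiple-beam former is a plain discrete Fourier transform of the element voltages, which is one possible choice and not necessarily what a given instrument uses; all names are illustrative.

```python
import cmath
import math


def dft_beams(voltages):
    """Multiple-beam former: N orthogonal beams from N element voltages (plain DFT)."""
    n = len(voltages)
    return [sum(v * cmath.exp(-2j * math.pi * b * k / n) for k, v in enumerate(voltages))
            for b in range(n)]


def element_space(voltages, weights):
    """Element-space beamformer: weight each element output and add."""
    return sum(w * v for w, v in zip(weights, voltages))


def beam_space(voltages, beam_weights):
    """Beam-space beamformer: form orthogonal beams first, then weight and add them."""
    return sum(u * b for u, b in zip(beam_weights, dft_beams(voltages)))


def equivalent_element_weights(beam_weights):
    """Element weights that reproduce a given set of beam-space weights."""
    n = len(beam_weights)
    return [sum(u * cmath.exp(-2j * math.pi * b * k / n) for b, u in enumerate(beam_weights))
            for k in range(n)]


volts = [1 + 2j, -0.5 + 0.3j, 0.7 - 1.1j, 2.0 + 0j]   # arbitrary element voltages
u = [0.2, 1.0, -0.3j, 0.5 + 0.5j]                    # arbitrary beam-space weights
```

Because both operations are linear, `beam_space(volts, u)` and `element_space(volts, equivalent_element_weights(u))` give the same output; the beam-space form is convenient when only a few of the orthogonal beams need to be combined.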

7.4 Radio Telescopes with Digital Beam Forming Networks 

7.4.1 The Clark Lake TEE-PEE-TEE Telescope 

This telescope no longer exists. I am using it here as a good example of a telescope
which uses a combination of beam forming and synthesis-imaging techniques. It was
a fully steerable decametric array: a T array of 720 conical spiral antennas,
3.0 km by 1.8 km. It had the best sensitivity in the 25 MHz to 75 MHz range. Both its operating
frequency and beam position were adjustable in less than 1 ms (see Erickson et al. 1982).

The basic element is a long spiral element utilising eight wires wound around a support 
system that consists of eight parallel filaments. Each element is circularly polarised with 
a diode switch at its apex that rotates its excitation and thus adjusts its phase. Steering of 
the array is accomplished by putting a linear phase gradient across groups of 15 elements, 
called banks. There are 16 banks in the 1800 m N-S arm and 32 banks in the 3000 m
E-W arm. The output of each bank is brought separately to the central observatory
building. 

A separate receiver channel is attached to the output of each of the 48 banks. Each
channel employs a superheterodyne receiver 4 to down convert the signal to 10 MHz. The
10 MHz output of each of the receiver channels is sampled at a frequency of 12 MHz, digitally
delayed and then cross-correlated in a 512 channel two-bit three level complex correlator.
An off-line processor removes the fringe rotation 5 introduced by the earth’s rotation and
integrates the data for periods up to 5 minutes. A Fourier transform then produces a
map of the area of the sky under observation. These maps may be averaged to effectively
integrate the signal for periods of hours.
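Sampling a 10 MHz signal at only 12 MHz is an instance of bandpass sampling, in which the band is deliberately allowed to alias to a lower frequency. The arithmetic can be sketched as below; this is a generic illustration of where a tone lands after sampling, and it makes no claim about the actual bandwidth or filtering used at Clark Lake.

```python
def aliased_frequency(f_signal, f_sample):
    """Apparent frequency (between 0 and f_sample/2) of a real tone after sampling."""
    f = f_signal % f_sample
    return min(f, f_sample - f)


def nyquist_zone(f_signal, f_sample):
    """1 for baseband (0 to fs/2), 2 for (fs/2 to fs), and so on."""
    return int(f_signal // (f_sample / 2.0)) + 1


apparent = aliased_frequency(10.0e6, 12.0e6)
zone = nyquist_zone(10.0e6, 12.0e6)
```

A 10 MHz tone sampled at 12 MHz appears at 2 MHz. It lies in the second Nyquist zone (6 to 12 MHz), so the sampled spectrum is frequency-reversed, and the analog band must be confined to that zone to avoid overlap with other aliases.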

Its total collecting area was 250λ². The synthesised beam at 30.9 MHz had a width of
13'.0 × 11'.1 at the zenith. The confusion limit of the telescope was around 1 Jy. It produced
1024 picture elements in a field of view roughly 6° × 4°.

4 See Section 3.1

5 See Section 4.4

7.4.2 GEETEE: The Gauribidanur T Array 

GEETEE is a low frequency radiotelescope operating at 34.5 MHz. It is situated near
Gauribidanur, ~80 km from Bangalore, India. The antenna system is a T shaped array
with 1000 dipoles, 640 in the 1.4 km long E-W array and 360 in the 0.45 km long S array.
Its collecting area in the EW × S correlation mode is 18,000 sq. m and it has a resolution of
26' × 42' sec(δ − 14°.1). The EW array consists of four rows of dipoles in the NS direction,
with 160 dipoles in each row. The S array consists of 90 rows in the NS direction with
four dipoles each placed in the EW direction.

A multibeam-forming receiver has been built for GEETEE to obtain long periods of
interference free observation over as large a patch of sky as possible in one day. A short
observing time for a wide field survey at low frequencies minimises the effects of the
ionosphere. For multibeam operation a single row of the EW array is used in the meridian
transit mode. A single row was chosen to maximise the coverage in declination. A single beam
in the EW direction was considered sufficient, as the images are confusion limited. The 90
outputs of the S array are transmitted to the observatory in 23 open-wire transmission
lines using time division multiplexing. In the observatory building, the signals from the
EW and S arrays are down-converted to an intermediate frequency of 4 MHz. Then each of
the S array outputs is correlated with the EW array output using one-bit correlators. This
gives 90 visibilities sampled at 5 m intervals along the NS direction. The Fourier transform
of these visibilities gives 90 multiple beams in the NS direction covering a span of ±47°
of zenith angle along the meridian. A two dimensional image of the sky is obtained by
stacking successive scans across the meridian.

7.4.3 MOST: The Molonglo Observatory Synthesis Telescope 

A severe disadvantage of the original Mills Cross was that it could make only transit
observations. It was recognized that a steerable telescope was necessary to obtain extended
observing times and greater sensitivity. To achieve this at a reasonable cost it was decided
to abandon the NS arm of the cross and provide a new phased system for the EW arm
only. With this a two dimensional aperture is synthesised using earth rotation synthesis.
If linear polarisation is used, the position angle of the feeds with respect to the sky will
also rotate. Hence, the existing linear feeds were replaced by circularly polarised feeds.

The usual aperture synthesis procedure accumulates data as points in the spatial 
frequency (u,v) plane and then interpolates them onto a rectangular grid 6. The map in
the (θ, φ) domain is produced by a fast Fourier transform. An important requirement of
this method is that the primary beam shape must not vary throughout the observation. 
This makes it unsuitable for the Molonglo telescope where the primary beam is derived 
from a rectangular aperture. Because of the mutual coupling problems together with 
the foreshortening of the effective aperture, the gain of the telescope can vary by over a 
factor of five as the pointing moves from the meridian to 60° from the meridian. This gain 
variation can be removed from the sampled data, but, the change in beam widths during 
observations leading to a large variation in the relative gain, between the center of the 
map and map edges, cannot be corrected for. 

The problem of non-circularity and variability of the primary beam may be overcome
by fan beam synthesis, or beam space beam forming. For this the E and the W
reflectors, each 778 m long and 11.6 m wide (separated by a gap of 15 m), are divided into
44 sections of length 17.7 m. The E and W reflectors are tilted about an EW axis by a
shaft extending the whole length. To control the direction of response in the east-west
direction a phase gradient is set up between the feed elements by differential rotation.
Each module output is heterodyned to 11 MHz. A phase controlled transmission line
running the length of each antenna distributes the Local Oscillator. One of these lines is
phase switched at 400 Hz.

6 See Chapter 11

The detection and synthesis process involves the formation of a set of contiguous 
fan beams in each antenna. The 44 signals are added together in a resistance array to 
produce 64 real time fan beams. Signals from corresponding beams from each antenna 
are multiplied to produce 64 real time interferometer beams. By switching the phase 
gradient by a small amount every second, these 64 beams are time multiplexed to produce 
either 128, 256, or 384 beams in each 24-second sample. Each beam has an EW width of
43'' and at meridian passage a NS width of 2°.3. The hardware beams have a separation
of 22'' and the time multiplexed beams 11'', which is just under half the Nyquist sampling
requirement.

If observations of a particular field extend over hour angles of ±6 h, the fan beam
rotates through all position angles and synthesis may be performed. The field is represented
by a square array of points corresponding to the projection of the celestial sphere
onto a plane normal to the earth’s rotation axis. Every 24 seconds, the accumulated signal
at each of the 4x63 fan beam response angles is added to the nearest (l, m) array points.
This process continues throughout the 12 hours of synthesis. Apart from summation, the
computation includes gain, pointing, and phase corrections; cleaning to improve the
map; and locating the sources and measuring their flux densities and positions.

7.4.4 Summary 

These three radio telescopes illustrate different methods of imaging using dipolar arrays
as applied to radioastronomy. GEETEE: One-dimensional image synthesis on the meridian
with the entire aperture being present at the same time; CLARK LAKE: A two dimensional
image synthesis which gave periods of integration much larger than the meridian
transit time. The entire aperture was present during an observation schedule; MOST:
Rotational synthesis which is used to synthesise a large two dimensional array, using a
linear array. All of them use principles of beam forming. GEETEE and CLARK LAKE use
the method of measurement of visibilities in the (u, v) domain, while MOST employs the
method of direct fan beam synthesis.

We see that the dipolar arrays are used in the meter wavelength ranges more often 
than at high frequencies. They have very wide fields of view (GEETEE, almost 100°) and 
are very good workhorses for surveying the sky. They are good imaging instruments 
also since they combine the phased array techniques with the principles of synthesis 
imaging to make images. Unfortunately most of the arrays are equipped with a limited
number of correlators and cannot measure all the possible n(n − 1)/2 baselines with
n aperture elements. Thus they are not well suited for applications of self-calibration.
Being skeleton telescopes, they have no redundancy in the imaging mode and redundant
baseline calibration is not easily applicable. (See Chapter 5 for a discussion on self-calibration
and redundant baseline calibration). This has resulted in surveys with limited
dynamic range capability. None of these low frequency arrays are equipped with feeds 
with orthogonal polarisation. So they are not suitable for polarisation studies. 

While combining the beam forming techniques with the synthesis techniques, one has
to be very careful about the sampling requirement of the spatial frequencies; otherwise
one will end up with grating lobes in the synthesised image, even while using linear arrays
with contiguous elements spaced λ/2 apart. Since the dipolar arrays are employed
generally as correlation telescopes and do not have a common collecting area in the arms
used for correlation, they suffer from the “zero-spacing problem” 7. Most often today’s
receivers employ bandpass sampling 8 and if the sampling frequency is not properly chosen
one will lose signal to noise. When imaging with arrays, it is not uncommon to
confront conflicting requirements between surveying sensitivity and the field of view.

A question may arise in your minds at this stage - with a handful of telescopes using 
the phased array approach, is there any future for them in radio astronomy? In the 
remainder of this chapter, I will discuss the possible future of dipolar arrays for radio 
astronomy. 


7.5 Square Kilometer Array (SKA) Concept 

In one way or another, all of the various research directions in radioastronomy are limited
by our current instrumental sensitivities. Only by ensuring continued access to
order-of-magnitude improvements in our capabilities can we ensure a continued high
rate of discovery! The sensitivity of radio telescopes, in the time between 1940 and 1980,
has shown an exponential improvement, over at least 6 orders of magnitude (10^5 mJy to
0.1 mJy for 1 minute integration time). Radio astronomers are toying with the idea of
building a telescope with an improvement in sensitivity by a factor of 100 and are hoping
that it will lead to fundamental scientific advances (Braun, 1996).

Consideration of the many varied scientific drivers suggests the following basic technical
specifications for the instrument:

1. A frequency range of 200 to 2000 MHz.

2. A total collecting area of 1 km².

3. Distribution over at least 32 elements. 

The NFRA, in their study of the SKA concept, suggest that broad-band, highly integrated
phased array antennas should be adopted for such an array. Some of the advantages
are:

1. Phased arrays give “complete” control of the beam. The main application considered
is the adaptive suppression of the RFI environment.

2. Multiple independent beams are possible, resulting in multiple programs and rapid
surveys.

They are planning development work in this direction in several steps: an adaptive array
demo, a one sq. meter array, a thousand element array and proof-of-principle arrays.
Discussion of all these aspects is beyond the scope of this chapter. Instead we end with
the principle of an adaptive array.


7.6 Adaptive Beam Forming 

An adaptive beam former is a device that is able to separate signals co-located in the
frequency band but separated in the spatial domain. This provides a means for separating
the desired signal from interfering signals. An adaptive beam former is able to
automatically optimise the array pattern by adjusting the elemental control weights until
a prescribed objective function is satisfied. An algorithm designed for that purpose specifies
the means by which the optimisation is achieved. These devices use far more of the
information available at the antenna aperture than does a conventional beamformer.

7 The zero spacing problem refers to the difficulty in imaging very large sources (whose visibilities peak
near the origin of the u-v plane) with arrays which provide few to no samples near the u-v plane origin. See
Section 11.6 for a more detailed discussion.

8 See Chapter 1

Figure 7.4: A two element adaptive array for interference suppression. The array simultaneously
accepts a signal coming from the zenith, while rejecting an interfering signal
30° from the zenith by a suitable choice of the weights w_i.

The procedure used for steering and modifying an array’s beam pattern in order to
enhance the reception of a desired signal, while simultaneously suppressing interfering
signals through complex weight selection, is illustrated by the following example. Let us
consider the array shown in Figure 7.4. The array consists of two antennas with a spacing
of λ/2. Let the signal S(t) arriving from a radio source at zenith be the desired signal, and let
I(t) be an interfering signal arriving from a direction θ = π/6 radians. The signal from
each element is multiplied by a variable complex weight (w_1, w_2) and the weighted signals
are then summed to form the array output. Since the desired signal arrives in phase at the
two elements, the array output due to it is

$Y(t) = A e^{j 2\pi f t} [w_1 + w_2]$. (7.6.1)

For Y(t) to be equal to S(t), it is necessary that

$\mathrm{RP}[w_1] + \mathrm{RP}[w_2] = 1$ (7.6.2)

and

$\mathrm{IP}[w_1] + \mathrm{IP}[w_2] = 0$, (7.6.3)

where RP and IP denote the real and imaginary parts of the complex weights. The interfering
signal arrives at element 2 with a phase lead of π/2 with respect to element 1.
Consequently the array output due to the interfering signal, of amplitude N, is given by

$Y_I(t) = [N e^{j 2\pi f t}] w_1 + [N e^{j(2\pi f t + \pi/2)}] w_2$. (7.6.4)

For the array response to the interference to be zero, it is necessary that

$\mathrm{RP}[w_1] + \mathrm{RP}[j w_2] = 0$ (7.6.5)

and

$\mathrm{IP}[w_1] + \mathrm{IP}[j w_2] = 0$. (7.6.6)

The requirement that the array has to respond to only the radio source and not to the
interfering signal leads to the solution

$w_1 = 1/2 - j/2$ (7.6.7)

and

$w_2 = 1/2 + j/2$. (7.6.8)

With these weights, the array will accept the desired signal while simultaneously rejecting
the interference.
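The solution can be verified numerically. The sketch below solves the two constraints (unit response towards the source, zero response towards the interference) for a general pair of directions and then evaluates the array response; the function names and the generalisation to arbitrary angles and spacing are additions for illustration.

```python
import cmath
import math


def array_response(weights, zenith_angle_rad, spacing_wavelengths=0.5):
    """Complex response of the weighted two-element array to a unit plane wave.

    Element 2 leads element 1 in phase by 2*pi*(d/lambda)*sin(theta).
    """
    phase = 2.0 * math.pi * spacing_wavelengths * math.sin(zenith_angle_rad)
    steering = [1.0, cmath.exp(1j * phase)]
    return sum(w * s for w, s in zip(weights, steering))


def solve_weights(theta_signal, theta_interference, spacing_wavelengths=0.5):
    """Solve w1 + a*w2 = 1 (unit gain on the signal) and w1 + b*w2 = 0 (null on interference)."""
    k = 2.0 * math.pi * spacing_wavelengths
    a = cmath.exp(1j * k * math.sin(theta_signal))
    b = cmath.exp(1j * k * math.sin(theta_interference))
    w2 = 1.0 / (a - b)
    w1 = -b * w2
    return w1, w2


w1, w2 = solve_weights(0.0, math.pi / 6)
```

For the geometry of Figure 7.4 this gives w1 = 1/2 − j/2 and w2 = 1/2 + j/2, with `array_response((w1, w2), 0.0)` equal to 1 and `array_response((w1, w2), math.pi / 6)` equal to 0 to rounding error. The solution fails, as it should, when the two directions produce the same phase difference (a = b).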

The method used in the above example exploits the fact that there is only one directional
interference source and uses a priori information concerning the frequency and
the directions of both of the signals. A more practical processor should not require such
detailed a priori information about the location, number and nature of sources. But
this example has demonstrated that a system consisting of an array, which is configured
with complex weights, provides numerous possibilities for realising array system objectives.
We need only develop a practical processor for carrying out the complex weight
adjustment. In such a processor the choice of the weighting will be based on the statistics
of the signal of interest received at the array. Basically the objective is to optimise the
beamformer response with respect to a prescribed criterion, so that the output contains
minimal contribution from the interfering signal.
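One widely used criterion of this kind is the minimum variance distortionless response (MVDR), which minimises the output power subject to unit gain in the look direction. The text does not name a specific algorithm, so the following is only an illustrative sketch: it assumes the two-element geometry of the example above, a known covariance matrix built from assumed noise and interference powers (a real processor would estimate it from the data), and an output formed as w1·x1 + w2·x2 without conjugation.

```python
def mvdr_weights(noise_power, interference_power):
    """Minimum-variance weights for the two-element example, from covariance statistics.

    Look direction: zenith (steering vector [1, 1]). Interferer: voltage at element 2
    leads element 1 by pi/2 (direction vector [1, j]).
    """
    p, s = interference_power, noise_power
    # Covariance R = E[x x^H] = s*I + p*v*v^H with v = [1, j]
    r11, r12 = s + p, -1j * p
    r21, r22 = 1j * p, s + p
    det = r11 * r22 - r12 * r21
    # Inverse of the 2x2 matrix R
    i11, i12 = r22 / det, -r12 / det
    i21, i22 = -r21 / det, r11 / det
    # u = R^-1 a / (a^H R^-1 a) with a = [1, 1]
    u1, u2 = i11 + i12, i21 + i22
    norm = u1 + u2
    u1, u2 = u1 / norm, u2 / norm
    # Weights are applied without conjugation in the text, so w = conj(u)
    return u1.conjugate(), u2.conjugate()


strong = mvdr_weights(1.0, 1.0e6)   # interference far above the noise
quiet = mvdr_weights(1.0, 0.0)      # no interference at all
```

With strong interference the weights tend to 1/2 − j/2 and 1/2 + j/2, the same solution that was obtained above from a priori knowledge of the directions, while with no interference they reduce to the uniform weights (1/2, 1/2) of a conventional beamformer. In both cases only the look direction had to be specified.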

There can be no doubt about the worsening observing situation in radio astronomy due 
to the increased use of frequency space for communications. But a pragmatic view is that 
it is hopeless to resist the increased use of frequency space by others and we must learn 
to live with it. The saving grace is that the requirements of mobile cellular, satellite and 
personal communication services systems are pushing the advancement in technology to 
provide increasingly faster and less expensive digital hardware. The present trend is to 
replace the analog functions of a radio receiver with software or digital hardware. The 
ultimate goal is to directly digitise the RF signal at the output of the receiving antenna 
and then implement the rest of the radio functions in either digital hardware or software. 
Trends have evolved toward this goal by incorporating digitisation closer and closer to the 
antenna at increasingly higher frequencies and wider bandwidths. It is appropriate that 
the radio astronomer uses this emerging technology to make the future radio telescopes 
interference free. Adaptive arrays hold the key to this endeavour.

Chapter 8

Correlator I. Basics 


D. Anish Roshi 


8.1 Introduction

A radio interferometer measures the mutual coherence function of the electric field due 
to a given source brightness distribution in the sky. The antennas of the interferometer 
convert the electric field into voltages. The mutual coherence function is measured by 
cross correlating the voltages from each pair of antennas. The measured cross correlation 
function is also called the visibility. In general it is required to measure the visibility for
different frequencies (spectral visibility) to get spectral information for the astronomical
source. The electronic device used to measure the spectral visibility is called a spectral
correlator. These devices are implemented using digital techniques. Digital techniques
are far superior to analog techniques as far as stability and repeatability are concerned.

The first of these two chapters on correlators covers some aspects of digital signal 
processing used in digital correlators. Details of the hardware implementation of the 
GMRT spectral correlator are presented in the next lecture. 


8.2 Digitization 

The signals [1] at the output of the antenna/receiver system are analog voltages. Measurements
using digital techniques require these voltages to be sampled and quantized.

8.2.1 Sampling 

A band limited signal $s(t)$ with bandwidth $\Delta\nu$ can be uniquely represented by a time series
obtained by periodically sampling $s(t)$ at a frequency $f_s$ (the sampling frequency) which is
greater than a critical frequency $2\Delta\nu$ (Shannon 1949). The signal is said to be 'Nyquist
sampled' if the sampling frequency is exactly equal to the critical frequency $2\Delta\nu$.

The spectrum of signals sampled at a frequency $< 2\Delta\nu$ (i.e. under sampled) is distorted.
Therefore the time series thus obtained is not a true representation of the band
limited signal. The spectral distortion caused by under sampling is called aliasing.
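
As an illustration of aliasing, the following minimal sketch samples a single tone at two different rates and locates the strongest DFT channel. The tone frequency, the sampling rates and the helper name `dominant_frequency` are illustrative assumptions, not values from these notes.

```python
import numpy as np

def dominant_frequency(f_tone, f_s, n_samples=1000):
    """Return the frequency (Hz) of the strongest DFT channel of a sampled tone."""
    t = np.arange(n_samples) / f_s                 # sampling instants
    x = np.cos(2 * np.pi * f_tone * t)             # sampled tone
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / f_s)
    return freqs[np.argmax(spectrum)]

f_tone = 70.0                                       # Hz, hypothetical test tone
well_sampled = dominant_frequency(f_tone, f_s=200.0)   # f_s > 2 * 70 Hz
under_sampled = dominant_frequency(f_tone, f_s=100.0)  # f_s < 2 * 70 Hz

print(well_sampled, under_sampled)
```

With the sampling rate above twice the tone frequency the peak appears at 70 Hz. With under sampling the same tone appears at 30 Hz (i.e. at $f_s$ minus the tone frequency), which is the aliased frequency.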

[1] For all the analysis presented here we assume that radio astronomy signals are stationary and ergodic
stochastic processes with a gaussian probability distribution. We also assume that the signals have zero mean.
8.2.2 Quantization



Figure 8.1: Transfer function of a two bit four level quantizer. The binary numbers 
corresponding to the quantized voltage range from 00 to 11. Quantization of a sine wave 
with such a quantizer is also shown. 

The amplitude of the sampled signal is a continuous value. Digital systems represent
values using a finite number of bits. Hence the amplitude has to be approximated and
expressed with this finite number of bits. This process is called quantization. The
quantized values are integer multiples of a quantity $q$ called the quantization step. An
example of two bit (or equivalently four level) quantization is shown in Fig. 8.1. For this
quantizer $q = V_{max}/2^2$, where $V_{max}$ is the maximum voltage (peak-to-peak) that can be
expressed within an error of $\pm q/2$.

Quantization distorts the sampled signal, affecting both the amplitude and spectrum
of the signal. This is evident from Fig. 8.1 for the case of a two bit four level quantized
sine wave. The amplitude distortion can be expressed in terms of an error function
$e(t) = s(t) - s_q(t)$, which is also called the quantization noise. Here $s_q(t)$ is the output of the
quantizer. The variance of quantization noise under certain restricted conditions (such
as uniform quantization) is $q^2/12$. The spectrum of quantization noise extends beyond the
bandwidth $\Delta\nu$ of $s(t)$ (see Fig. 8.2). Sampling at the Nyquist rate ($2\Delta\nu$) therefore aliases
the power of the quantization noise outside $\Delta\nu$ back into the spectral band of $s(t)$. For
radio astronomy signals, the spectral density of the quantization noise within $\Delta\nu$ can
be considered uniform and is $\sim q^2/(12\Delta\nu)$ (assuming uniform quantization). Reduction in
quantization noise is hence possible by oversampling $s(t)$ (i.e. $f_s > 2\Delta\nu$) since it reduces
the aliased power. For example, the signal to noise ratio of a digital measurement of the
correlation function of $s(t)$ (see Section 8.5) using a two bit four level quantizer is 88% of
the signal to noise ratio obtained by doing analog correlation if one samples at the
Nyquist rate, and 94% if one were to sample at twice the Nyquist rate.
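
The $q^2/12$ variance quoted above can be checked numerically. The sketch below is only an illustration under stated assumptions: a uniform quantizer with a step much smaller than the rms of the signal ($q = \sigma/4$, an arbitrary choice), no clipping, and gaussian input. The function name `quantize` is hypothetical.

```python
import numpy as np

def quantize(x, q):
    """Uniform quantizer with step q and no clipping (an idealisation)."""
    return q * np.round(x / q)

rng = np.random.default_rng(0)
sigma = 1.0
q = 0.25 * sigma                       # assumed step, small compared to sigma
s = rng.normal(0.0, sigma, 200_000)    # zero mean gaussian samples
e = s - quantize(s, q)                 # quantization noise e = s - s_q

measured_var = e.var()
predicted_var = q**2 / 12
print(measured_var, predicted_var)
```

For coarse quantizers (for example the two bit quantizer of Fig. 8.1) the error is correlated with the signal and this simple estimate is no longer accurate, which is one reason the text restricts the result to certain conditions.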

The largest value that can be expressed by a quantizer is determined by the number
of bits ($M$) used for quantization. This value is $2^M - 1$ for binary representation. The
finite number of bits puts an upper bound on the amplitude of input voltage that can
be expressed within an error $\pm q/2$. Signals with amplitude above the maximum value
will be 'clipped', thus producing further distortion. This distortion is minimum if the
probability of amplitude of the signal exceeding $+V_{max}/2$ and $-V_{max}/2$ is less than $10^{-5}$.
For a signal with a gaussian amplitude distribution this means that $V_{max} = 4.12\sigma$, $\sigma$ being
the standard deviation of $s(t)$.

Figure 8.2: Power spectrum of band limited gaussian noise after one bit quantization.
The spectrum of the original analog signal is shown with a solid line, while that of the
quantized signal is shown with a dotted line.


8.2.3 Dynamic Range 

As described above, the quantizer degrades the signal if its (peak-to-peak) amplitude
is above an upper bound $V_{max}$. The minimum change in signal amplitude that can be
expressed is limited by the quantization step $q$. Thus a given quantizer operates over a
limited range of input voltage amplitude called its dynamic range. The dynamic range of
a quantizer is usually defined by the ratio of the power of a sinusoidal signal with
peak-to-peak amplitude $= V_{max}$ to the variance of the quantization noise. For an ideal quantizer
with uniform quantization the dynamic range is $\frac{3}{2}2^{2M}$. Thus the dynamic range is larger
if the number of bits used for quantization is larger.
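
As a worked check of the dynamic range expression, assume an ideal $M$ bit quantizer with $q = V_{max}/2^M$ and quantization noise variance $q^2/12$. The symbols $P_{sine}$ and $\sigma_q^2$ are introduced here only for this illustration.

```latex
P_{sine} = \frac{1}{2}\left(\frac{V_{max}}{2}\right)^{2} = \frac{V_{max}^{2}}{8},
\qquad
\sigma_q^{2} = \frac{q^{2}}{12} = \frac{V_{max}^{2}}{12 \cdot 2^{2M}},
\qquad
\frac{P_{sine}}{\sigma_q^{2}} = \frac{12 \cdot 2^{2M}}{8} = \frac{3}{2}\, 2^{2M}.
```

For $M = 2$ this ratio is 24 (about 13.8 dB), and each additional bit increases it by a factor of 4 (about 6 dB).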


8.3 Discrete Fourier Transform 


The Fourier Transform (FT) of a signal $s(t)$ is defined as

$$ S(\omega) = \int_{-\infty}^{+\infty} s(t)\, e^{-j\omega t}\, dt \qquad (8.3.1) $$

The Discrete Fourier Transform (DFT) is an operation to evaluate the FT of the sampled signal
$s(n)$ ($= s(n/f_s)$) with a finite number of samples (say $N$). It is defined as

$$ S(k) = \sum_{n=0}^{N-1} s(n)\, e^{-j2\pi nk/N}, \qquad 0 \le k \le N-1 \qquad (8.3.2) $$

The relationship between FT and DFT and some properties of DFT are discussed here. 

Consider a time series $s(n)$, which is obtained by sampling a continuous band limited
signal $s(t)$ at a rate $f_s$ (see Fig. 8.3). The sampling function is a train of delta functions
$\mathrm{III}(t)$. The length of the series is restricted to $N$ samples by multiplying with a rectangular
window function $\Pi(t)$. The modification of the signal $s(t)$ due to these operations and the
corresponding changes in the spectrum are shown in Fig. 8.3. The spectral modifications
can be understood from the properties of Fourier transforms. The FT of the time series
can now be written as a summation (assuming $N$ is even)

$$ S(\omega) = \int_{-\infty}^{+\infty} s(t) \sum_{n=-N/2}^{N/2-1} \delta\!\left(t - \frac{n}{f_s}\right) e^{-j\omega t}\, dt
           = \sum_{n=-N/2}^{N/2-1} s\!\left(\frac{n}{f_s}\right) e^{-j\omega n/f_s} \qquad (8.3.3) $$

What remains is to quantize the frequency variable. For this the frequency domain is
sampled such that there is no aliasing in the time domain (see Fig. 8.3). This is satisfied
if $\Delta\omega = 2\pi f_s/N$. Thus Eq. 8.3.3 can be written as

$$ S(k\Delta\omega) = \sum_{n=-N/2}^{N/2-1} s\!\left(\frac{n}{f_s}\right) e^{-jk\Delta\omega\, n/f_s} \qquad (8.3.4) $$

Using the relation $\Delta\omega/f_s = 2\pi/N$ and writing the variables as discrete indices we get the
DFT equation. The cyclic nature of the DFT (see below) allows $n$ and $k$ to range from 0 to $N - 1$
instead of $-N/2$ to $N/2 - 1$.

Some properties that require attention are: 

1. The spectral values computed for $N/2 \le k \le 3N/2 - 1$ are identical to those for
$k = -N/2$ to $N/2 - 1$. In fact the computed values have a periodicity equal to $N\Delta\omega$
which makes the DFT cyclic in nature. This periodicity is a consequence of the
sampling done in the time and frequency domain (see Fig. 8.3).

2. The sampling interval of the frequency variable $\Delta\omega$ ($= 2\pi f_s/N$) is inversely
proportional to the total number of samples used in the DFT. This is discussed further in
Section 8.3.1.

There are several algorithms developed to reduce the number of operations in the DFT
computation, which are called Fast Fourier Transform (FFT) algorithms. These algorithms
reduce the time required for the computation of the DFT from $O(N^2)$ to $O(N \log N)$. The
FFT implementation used in the GMRT correlator uses Radix 4 and Radix 2 algorithms.
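
The DFT sum of Eq. 8.3.2 can be evaluated directly and compared with an FFT. The sketch below uses NumPy's general purpose FFT, which is an assumption made purely for illustration and is not the Radix 4 / Radix 2 hardware implementation of the GMRT correlator. The function name `dft_direct` is hypothetical.

```python
import numpy as np

def dft_direct(s):
    """Evaluate the DFT sum of Eq. 8.3.2 directly, using O(N^2) operations."""
    N = len(s)
    n = np.arange(N)
    k = n.reshape(-1, 1)                       # one row of coefficients per k
    return np.exp(-2j * np.pi * k * n / N) @ s

rng = np.random.default_rng(1)
s = rng.normal(size=64)                        # arbitrary real test series

S_direct = dft_direct(s)
S_fft = np.fft.fft(s)                          # same sign convention as Eq. 8.3.2
print(np.max(np.abs(S_direct - S_fft)))
```

The two results agree to within floating point rounding; only the number of operations differs.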

In the digital implementation of FFTs the quantization of the coefficients $e^{-j2\pi nk/N}$
degrades the signal to noise ratio of the spectrum. This degradation is in addition to the
quantization noise introduced by the quantizer. Thus the dynamic range reduces further
due to coefficient quantization. Coefficient quantization can also produce systematics
in the computed spectrum. This effect also depends on the statistics of the input signal,
and in general can be reduced only by using a larger number of bits for coefficient
representation.





8.3.1 Filtering and Windowing 

The Fourier transform of a signal $s(t)$ is a decomposition into frequency or spectral
components. The DFT also performs a spectral decomposition but with a finite spectral
resolution. The spectrum of a signal $s(t)$ obtained using a DFT operation is the
true spectrum of the signal $S(f)$ convolved with the FT $W(f)$ of the window function,
and sampled at discrete frequencies. Thus a DFT is equivalent to a filter bank with filters
spaced at $\Delta\omega$ in frequency. The response of each filter is the Fourier transform of the
window function used to restrict the number of samples to $N$. For example, in the above
analysis (see Section 8.3) the response of each 'filter' is the sinc function (which is the FT
of the rectangular window $\Pi(t)$). The spectral resolution (defined as the full width at half
maximum (FWHM) of the filter response) of the sinc function is $1.21\Delta\omega$. Different window
functions $w(n)$ give different 'filter' responses, i.e. for

$$ S(k) = \sum_{n=0}^{N-1} w(n)\, s(n)\, e^{-j2\pi nk/N} \qquad (8.3.5) $$

the Hanning window

$$ w(n) = \begin{cases} 0.5\,(1 + \cos(2\pi n/N)) & \text{for } -N/2 \le n \le N/2 - 1 \\ 0 & \text{elsewhere} \end{cases} \qquad (8.3.6) $$

has a spectral resolution $2\Delta\omega$. Side lobe reduction and resolution are the two principal
considerations in choosing a given window function (or equivalently a given filter
response). The rectangular window (i.e. sinc response function) has high resolution but a
peak sidelobe of 22% while the Hanning window has poorer resolution but a peak sidelobe
level of only 2.6%.
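
The trade-off between resolution and sidelobe level can be explored numerically. The following sketch computes the amplitude response of a single DFT 'filter' for the rectangular and Hanning windows by zero padding the window before the FFT. The window length, the padding factor and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def filter_response(window, pad=64):
    """Normalised amplitude response of one DFT 'filter'.
    Returns the frequency axis in units of the channel spacing and the response."""
    N = len(window)
    W = np.abs(np.fft.fft(window, N * pad))    # zero padded FT of the window
    W /= W[0]
    bins = np.arange(N * pad) / pad
    half = N * pad // 2                        # response is symmetric; keep one side
    return bins[:half], W[:half]

def fwhm_and_sidelobe(window):
    bins, W = filter_response(window)
    first_below = np.argmax(W < 0.5)           # first point below half maximum
    fwhm = 2 * bins[first_below]
    # The first local minimum marks the first null; sidelobes lie beyond it.
    minima = np.where((W[1:-1] < W[:-2]) & (W[1:-1] <= W[2:]))[0] + 1
    peak_sidelobe = W[minima[0]:].max()
    return fwhm, peak_sidelobe

N = 64
n = np.arange(-N // 2, N // 2)
rect = np.ones(N)
hanning = 0.5 * (1 + np.cos(2 * np.pi * n / N))    # Eq. 8.3.6

rect_fwhm, rect_sidelobe = fwhm_and_sidelobe(rect)
hann_fwhm, hann_sidelobe = fwhm_and_sidelobe(hanning)
print(rect_fwhm, rect_sidelobe, hann_fwhm, hann_sidelobe)
```

The widths are returned in units of the channel spacing $\Delta\omega$ and are limited in precision by the padding factor. The sidelobe levels are fractions of the peak amplitude response.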


8.4 Digital Delay 

In interferometry the geometric delay suffered by a signal (see Chapter 4) has to be
compensated before correlation is done. In an analog system this can be achieved by adding
or removing cables from the signal path. An equivalent method in digital processing is
to take sampled data that are offset in time. Mathematically, $s(n - m)$ is the sample
delayed by $m \times 1/f_s$ with respect to $s(n)$ (where $f_s$ is the sampling frequency). In such an
implementation of delay it is obvious that the delay can be corrected only to the nearest
integral multiple of $1/f_s$.

A delay less than $1/f_s$ (called fractional delay) can also be achieved digitally. A delay
$\tau$ introduced in the path of a narrow band signal with angular frequency $\omega$ produces a
phase $\phi = \omega\tau$. Thus, for a broad band signal, the delay introduces a phase gradient
across the spectrum. The slope of the phase gradient is equal to the delay, or $\tau = d\phi/d\omega$. This
means that introducing a phase gradient in the FT of $s(t)$ is equivalent to introducing a
delay in $s(t)$. Small enough phase gradients can be applied to realize a delay $< 1/f_s$. In
the GMRT correlator, residual delays $\tau < 1/f_s$ are compensated using this method. This
correction is called the Fractional Sampling Time Correction or FSTC.
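
The equivalence between a phase gradient in the spectrum and a delay in time can be sketched as follows. This is a software illustration only, not the GMRT FSTC implementation; the function name `fractional_delay`, the test tone and the delay of 0.3 samples are assumptions. Note that a delay applied through the DFT is circular.

```python
import numpy as np

def fractional_delay(s, delay_samples):
    """Delay a real time series by a (possibly fractional) number of samples
    by applying a linear phase gradient across its spectrum."""
    N = len(s)
    freqs = np.fft.rfftfreq(N)                     # frequency in cycles per sample
    phase = -2 * np.pi * freqs * delay_samples     # phi = -omega * tau
    return np.fft.irfft(np.fft.rfft(s) * np.exp(1j * phase), n=N)

N = 256
n = np.arange(N)
cycles = 10                                        # tone completes 10 cycles in N samples
s = np.cos(2 * np.pi * cycles * n / N)

delayed = fractional_delay(s, 0.3)                 # delay of 0.3 sampling intervals
expected = np.cos(2 * np.pi * cycles * (n - 0.3) / N)
print(np.max(np.abs(delayed - expected)))
```

For this tone, which is periodic over the block, the phase gradient reproduces the analytically delayed signal to within rounding error.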


8.5 Discrete Correlation and the Power Spectral Density 

The cross correlation of two signals $s_1(t)$ and $s_2(t)$ is given by

$$ R_c(\tau) = \langle s_1(t)\, s_2(t + \tau) \rangle \qquad (8.5.7) $$

where $\tau$ is the time delay between the two signals. In the above equation the angle
bracket indicates averaging in time. For measuring $R_c(\tau)$ in practice an estimator is
defined as

$$ R(m) = \frac{1}{N} \sum_{n=0}^{N-1} s_1(n)\, s_2(n + m), \qquad 0 \le m \le M \qquad (8.5.8) $$

where $m$ denotes the number of samples by which $s_2(n)$ is delayed, and $M$ is the maximum
delay ($M \ll N$). By definition $R(m)$ is a random variable. The expectation value of $R(m)$
converges to $R_c(\tau = m/f_s)$ when $N \to \infty$. The autocorrelation of the time series $s_1(n)$ is also
obtained using an equation similar to Eq. 8.5.8 by replacing $s_2(n + m)$ by $s_1(n + m)$.

The correlation function estimated from the quantized samples in general deviates
from the measurements taken with infinite amplitude precision. The deviation depends
on the true correlation value of the signals. The relationship between the two measurements
can be expressed as

$$ R_c(m/f_s) = F(R(m)) \qquad (8.5.9) $$

where $R_c(m/f_s)$ and $R(m)$ are the normalized correlation functions (normalized with the zero
lag correlation in the case of autocorrelation and with the square root of the product of the
zero lag autocorrelations of the signals $s_1(t)$ and $s_2(t)$ in the case of cross correlation) and $F$ is
a correction function. It can be shown that the correction function is monotonic (Van Vleck &
Middleton 1966, Cooper 1970, Hagan & Farley 1973, Kogan 1998). For example, the functional
dependence for one-bit quantization (the 'Van Vleck Correction') is

$$ R_c(m/f_s) = \sin\left(\frac{\pi}{2} R(m)\right) \qquad (8.5.10) $$


Note that the correction function is non-linear and hence this correction should be 
applied before any further operation on the correlation function. If the number of bits 
used for quantization is large then over a large range of correlation values the correction 
function is approximately linear. 
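
The one-bit correction of Eq. 8.5.10 can be tried on simulated data. The sketch below is an illustration under stated assumptions: two zero mean gaussian signals with a chosen true correlation of 0.5, one-bit quantization implemented as the sign of each sample, and the estimator of Eq. 8.5.8 taken as a mean of lagged products. The names `correlate` and `van_vleck` are hypothetical.

```python
import numpy as np

def correlate(s1, s2, max_lag):
    """Estimator of Eq. 8.5.8 for lags 0..max_lag (mean of lagged products)."""
    N = len(s1) - max_lag
    return np.array([np.mean(s1[:N] * s2[m:m + N]) for m in range(max_lag + 1)])

def van_vleck(r_one_bit):
    """One-bit quantization correction, Eq. 8.5.10."""
    return np.sin(0.5 * np.pi * r_one_bit)

rng = np.random.default_rng(2)
rho_true = 0.5                                   # assumed true correlation
N = 400_000
common = rng.normal(size=N)
s1 = np.sqrt(rho_true) * common + np.sqrt(1 - rho_true) * rng.normal(size=N)
s2 = np.sqrt(rho_true) * common + np.sqrt(1 - rho_true) * rng.normal(size=N)

# One-bit quantization keeps only the sign of each sample, so the zero lag
# autocorrelation of each quantized series is 1 and no extra normalization is needed.
q1, q2 = np.sign(s1), np.sign(s2)
r_quantized = correlate(q1, q2, max_lag=0)[0]
r_corrected = van_vleck(r_quantized)
print(r_quantized, r_corrected)
```

The quantized correlation comes out well below the true value, and the correction recovers the true correlation to within the statistical noise of the estimate.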

The power spectral density (PSD) of a stationary stochastic process is defined to be the
FT of its auto-correlation function (the Wiener-Khinchin theorem). That is, if
$R_c(\tau) = \langle s(t)\, s(t - \tau) \rangle$ then the PSD, $S_c(f)$, is

$$ S_c(f) = \int_{-\infty}^{+\infty} R_c(\tau)\, e^{-j2\pi f\tau}\, d\tau \qquad (8.5.11) $$

From the properties of Fourier transforms we have

$$ R_c(0) = \langle s(t)\, s(t) \rangle = \int_{-\infty}^{+\infty} S_c(f)\, df \qquad (8.5.12) $$

i.e. the function $S_c(f)$ is a decomposition of the variance (i.e. 'power') of $s(t)$ into
different frequency components.

For sampled signals, the PSD is estimated by the Fourier transform of the discrete 
auto-correlation function. In case the signal is also quantized before the correlation, then 
one has to apply a Van Vleck correction prior to taking the DFT. Exactly as before, this 
estimate of the PSD is related to the true PSD via convolution with the window function. 

One could also imagine trying to determine the PSD of a function $s(t)$ in the following
way. Take the DFTs of the sampled signal $s(n)$ for several periods of length $N$, average
their squared magnitudes, and use this as an estimate of the PSD. It can be shown that this process
is exactly equivalent to taking the DFT of the discrete auto-correlation function.
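
The equivalence of the two routes to the PSD can be verified numerically. The sketch below assumes a circular (cyclic) definition of the discrete auto-correlation within each block of $N$ samples, for which the equivalence is exact; block length, number of blocks and function names are illustrative choices.

```python
import numpy as np

def circular_autocorrelation(s):
    """R(m) = (1/N) sum_n s(n) s((n + m) mod N), for m = 0..N-1."""
    N = len(s)
    return np.array([np.mean(s * np.roll(s, -m)) for m in range(N)])

rng = np.random.default_rng(3)
N, n_blocks = 64, 200
blocks = rng.normal(size=(n_blocks, N))          # several periods of length N

# Route 1: DFT each block and average the squared magnitudes.
psd_from_dft = np.mean(np.abs(np.fft.fft(blocks, axis=1)) ** 2, axis=0) / N

# Route 2: average the auto-correlation functions, then take one DFT.
acf = np.mean([circular_autocorrelation(b) for b in blocks], axis=0)
psd_from_acf = np.fft.fft(acf).real

print(np.max(np.abs(psd_from_dft - psd_from_acf)))
```

With a non-circular auto-correlation (lags limited to $M \ll N$) the two estimates differ by the window function, as discussed in Section 8.3.1.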

The cross power spectrum of the two signals is defined as the FT of the cross
correlation function and the estimator is defined in a similar manner to that of the
autocorrelation case.
Chapter 9

Correlator - II: Implementation 


D. Anish Roshi 


The visibility measured by an interferometer is characterized by the amplitude and
phase of the fringe at different instants. For simplicity first consider the output of a two
element interferometer. In the quasi monochromatic approximation the multiplier output
can be written as (see Chapter 4)

$$ r_R(\tau_g) = \mathrm{Re}[v_1(\nu,t)\, v_2^*(\nu,t)] = |V| \cos(2\pi\nu\tau_g + \Phi_V), \qquad (9.0.1) $$

where $v_1(\nu,t)$ and $v_2(\nu,t)$ are the voltages at the outputs of the receiver systems of the
two antennas, $|V|$ and $\Phi_V$ are the amplitude and the phase of the visibility and $\tau_g$ is the
geometric delay. The quantities required for mapping a source are $|V|$ and $\Phi_V$ for all pairs
of antennas of the interferometer. These quantities are measured by first canceling the
$2\pi\nu\tau_g$ term in Eq. 9.0.1 by delay tracking and fringe stopping. In general, one needs to
know the amplitude and phase of the visibility as a function of frequency. This chapter
covers the implementation of a spectral correlator to measure the visibility amplitude and
phase. Further, since the delay tracking (and fringe stopping for some cases) is usually
also done by the correlator, these issues are also discussed.


9.1 Delay Tracking and Fringe Stopping 

Signals received by antennas are down converted to baseband by mixing with a local
oscillator of frequency $\nu_{LO}$. The geometric delay compensation is usually done by
introducing delays in the baseband signal. The output of a correlator after introducing a delay
$\tau_i$ can be written as (see Chapter 4)

$$ r_R(\tau_g) = |V| \cos(2\pi\nu\tau_g - 2\pi\nu_{BB}\tau_i + \Phi_V) \qquad (9.1.2) $$

$$ \phantom{r_R(\tau_g)} = |V| \cos(2\pi\nu_{LO}\tau_g - 2\pi\nu_{BB}\Delta\tau_i + \Phi_V), \qquad (9.1.3) $$

where $\nu_{BB}$ is the baseband frequency (so that $\nu = \nu_{LO} + \nu_{BB}$) and $\Delta\tau_i = \tau_i - \tau_g$ is the
residual delay. There are two terms that arise in the equation due to delay compensation:

1. $2\pi\nu_{BB}\Delta\tau_i$, and

2. $2\pi\nu_{LO}\tau_g$.

The first term is due to the finite precision of delay compensation and the latter is a
consequence of the delay being compensated in the baseband (as opposed to the RF, which
is where the geometric delay is suffered, see Chapter 4). The phase $2\pi\nu_{BB}\Delta\tau_i$ depends on
$\nu_{BB}$. For observations with a bandwidth $\Delta\nu$ this term produces a phase gradient across
$\Delta\nu$. The phase gradient is a function of time since the delay error changes with time.
The phase $2\pi\nu_{LO}\tau_g$ is independent of $\nu_{BB}$, and is thus constant across the entire band. This
phase is also a function of time due to the time dependence of $\tau_g$. Thus both these quantities
have to be dynamically compensated.


[Figure 9.1 panels: delay implementation using shift registers; delay implementation using memory]

Figure 9.1: Digital implementation of delay tracking in units of the sampling period using 
shift registers (top) and random access memory (bottom). 

Delay compensation in multiples of the sampling interval $1/f_s$ can be achieved by shifting
the sampled data (see Chapter 8). This is schematically shown in Fig. 9.1. The digitized
samples are passed through shift registers. The length of the shift registers is adjusted
to introduce the required delay between the signals. Another way of implementing delay
is by using random access memory (RAM). In this scheme, the data from the antennas
are written into a RAM (Fig. 9.1). The data are then read out from this memory for further
processing. However, the read pointer and the write pointer are offset, and the offset
between the two can be adjusted to introduce exactly the required delay. In the GMRT
correlator, the delay compensation is done using such a high speed dual port RAM.

A fractional delay can be introduced by changing the phase of the sampling clock.
The phase is changed such that signals from two antennas are sampled with a time
difference equal to the fractional delay. A second method is to introduce phase gradients
in the spectrum of the signal (see Chapter 8). This phase gradient can be introduced after
taking Fourier Transforms of signals from the antennas (see Section 9.2.1).

Compensation of $2\pi\nu_{LO}\tau_g$ (called fringe stopping) can be done by changing the phase
of the local oscillator signal by an amount $\phi_{LO}$ so that $2\pi\nu_{LO}\tau_g - \phi_{LO} = 0$. Alternatively,
this compensation can be achieved digitally by multiplying the sampled time series by
$e^{-j\phi_{LO}}$. (Recall from above that the fringe rate is the same for all frequency channels, so
this correction can be done in the time domain.) The fringe phase

$$ \phi_{LO}(t) = 2\pi\nu_{LO}\tau_g = 2\pi\nu_{LO} \frac{b \sin(\Omega t)}{c} \qquad (9.1.4) $$

is a non-linear function of time (see Chapter 4). Here $\Omega$ is the rate at which the source is
moving in the sky (i.e. the angular rotation speed of the earth), $b$ is the baseline length
and $c$ is the velocity of light. For a short time interval $\Delta t$ about $t_0$ the time dependence
can be approximated as

$$ \phi_{LO}(t) = \phi_{LO}(t_0) + 2\pi\nu_{LO} \frac{b\,\Omega \cos(\Omega t_0)}{c} \Delta t, \qquad (9.1.5) $$

i.e. $\phi_{LO}(t)$ is the phase of an oscillator with frequency

$$ \nu_{LO} \frac{b\,\Omega \cos(\Omega t_0)}{c} \qquad (9.1.6) $$


After a time interval $\Delta t$ the frequency of the oscillator has to be updated. A digital
implementation of an oscillator of this type is called a numerically controlled oscillator (NCO).
The frequency of the oscillator is varied by loading a control number to the device. The
initial phase of the NCO can also be controlled, which is used to introduce $\phi_{LO}(t_0)$. In the
GMRT correlator, fringe stopping is done using an NCO.
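
To get a feeling for the numbers, the fringe phase of Eq. 9.1.4 and the oscillator frequency of Eq. 9.1.6 can be evaluated directly. The LO frequency, baseline length and times used below are purely illustrative assumptions and are not GMRT parameters; the function names are hypothetical.

```python
import math

C = 299_792_458.0                      # velocity of light, m/s
OMEGA = 2 * math.pi / 86164.1          # angular rotation speed of the earth, rad/s

def fringe_phase(nu_lo, b, t):
    """Eq. 9.1.4: fringe phase phi_LO(t) in radians."""
    return 2 * math.pi * nu_lo * b * math.sin(OMEGA * t) / C

def fringe_frequency(nu_lo, b, t0):
    """Eq. 9.1.6: oscillator (NCO) frequency in Hz valid around time t0."""
    return nu_lo * b * OMEGA * math.cos(OMEGA * t0) / C

# Illustrative values: 1 GHz LO, 25 km baseline, one hour from t = 0, 1 s update.
nu_lo, b, t0, dt = 1.0e9, 25.0e3, 3600.0, 1.0

exact = fringe_phase(nu_lo, b, t0 + dt)
linear = (fringe_phase(nu_lo, b, t0)
          + 2 * math.pi * fringe_frequency(nu_lo, b, t0) * dt)   # Eq. 9.1.5
print(fringe_frequency(nu_lo, b, t0), exact - linear)
```

For these assumed values the fringe frequency is a few Hz, and the linear approximation of Eq. 9.1.5 is in error by well under a milliradian after one second, which indicates how often the NCO frequency would need updating.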


9.2 Spectral Correlator 

The output of a simple multiplier of the two element interferometer after delay
compensation can be written as:

$$ r_R = |V| \cos(\Phi_V). \qquad (9.2.7) $$

To separate $|V|$ and $\Phi_V$ a second product is measured after introducing a phase shift of
90 deg in the signal path (see Fig 9.2). Introducing a 90 deg shift in the path of one of the
signals will result in (see Eq. 9.0.1)

$$ r_I(\tau_g) = |V| \cos(2\pi\nu\tau_g + \Phi_V - \pi/2), \qquad (9.2.8) $$

and after compensating for $2\pi\nu\tau_g$

$$ r_I = |V| \cos(\Phi_V - \pi/2) = |V| \sin(\Phi_V). \qquad (9.2.9) $$



Figure 9.2: Block diagram of a complex multiplier. 


From these two measurements we get

$$ |V| = \sqrt{r_R^2 + r_I^2} \qquad (9.2.10) $$

$$ \Phi_V = \tan^{-1}\left(\frac{r_I}{r_R}\right). \qquad (9.2.11) $$

Alternatively, for mathematical convenience, the two measurements can be considered as
the real and imaginary parts of a complex number, i.e.

$$ V = r_R + j r_I \qquad (9.2.12) $$

Thus the pair of multipliers together with an integrator (to get the time average) form the 
basic element of a complex correlator. 
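
A software analogue of the complex correlator can be sketched as follows. The voltages are modelled as complex phasors sharing a common random part, and the visibility amplitude (0.8) and phase (40 deg) are arbitrary assumed values. This is an illustration of the two-multiplier idea only, not a model of the hardware.

```python
import numpy as np

rng = np.random.default_rng(4)
n_samples = 100_000
amp_true, phase_true = 0.8, np.deg2rad(40.0)     # assumed visibility

# Quasi monochromatic voltages written as complex phasors with a common random part.
common = rng.normal(size=n_samples) + 1j * rng.normal(size=n_samples)
v1 = common * np.sqrt(amp_true / 2) * np.exp(1j * phase_true)
v2 = common * np.sqrt(amp_true / 2)

# First multiplier and integrator: r_R = <Re[v1 v2*]>.
r_R = np.mean((v1 * np.conj(v2)).real)
# Second multiplier: the same product with a 90 deg shift in the path of v2.
r_I = np.mean((v1 * np.conj(1j * v2)).real)

amplitude = np.hypot(r_R, r_I)                   # |V| from the two products
phase = np.arctan2(r_I, r_R)                     # Phi_V, with the correct quadrant
print(amplitude, np.rad2deg(phase))
```

`arctan2` is used instead of a plain inverse tangent so that the phase is recovered in the correct quadrant when $r_R$ is negative.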

In the above analysis a narrow band signal (quasi monochromatic) is considered. In
an actual interferometer the observations are made over a finite bandwidth $\Delta\nu$ and one
requires the complex visibilities to be measured as a function of frequency within $\Delta\nu$.
This can be achieved in one of the two ways described below.

9.2.1 FX Correlator 

The band limited signal can be decomposed into spectral components using a filter bank.
The spectral visibility is then obtained by separately cross correlating each filter output
using a complex correlator (see Fig. 9.3). The digital implementation of this method is
called an FX correlator (F for Fourier Transform and X for multiplication or correlation).
The GMRT correlator is an FX correlator. A schematic of an FX correlator is shown in
Fig. 9.4. The analog voltages $V_1(t)$ and $V_2(t)$ are digitized first using ADCs. The geometric
delay in steps of the sampling interval (integral delay) is then compensated for. The
integral delay compensated samples are multiplied by the output of the NCO for fringe
stopping. The samples from each antenna are then passed through an FFT block to realize a
filter bank. Phase gradients are applied after taking the Fourier Transform for fractional
delay compensation. The spectral visibility is then measured by multiplying the spectral
components of one antenna with the corresponding spectral components of other antennas.
These are then integrated for some time to get an estimate of the cross correlation.
Since the Fourier transform is taken before multiplication it is called an FX correlator. For
continuum observations with an FX correlator the visibility measured from all spectral
components can be averaged after bandpass calibration.

Figure 9.3: A spectral correlator using filter bank and complex multipliers.

Figure 9.4: Block diagram of an FX correlator.
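
The F and X steps can be illustrated in a few lines. The sketch below is a software toy under stated assumptions: real gaussian noise as the antenna voltage, a one sample delay between the two inputs, no fringe stopping or fractional delay correction, and NumPy's FFT standing in for the hardware filter bank. The function name `fx_correlate` is hypothetical.

```python
import numpy as np

def fx_correlate(x1, x2, n_fft):
    """Cross spectrum of two real sampled voltages: FFT each block (F),
    multiply channel by channel (X), then integrate over blocks."""
    n_blocks = len(x1) // n_fft
    b1 = x1[:n_blocks * n_fft].reshape(n_blocks, n_fft)
    b2 = x2[:n_blocks * n_fft].reshape(n_blocks, n_fft)
    X1 = np.fft.rfft(b1, axis=1)
    X2 = np.fft.rfft(b2, axis=1)
    return np.mean(X1 * np.conj(X2), axis=0)

rng = np.random.default_rng(5)
n_fft = 128
x1 = rng.normal(size=n_fft * 500)
x2 = np.roll(x1, 1)                              # x2 lags x1 by one sample

vis = fx_correlate(x1, x2, n_fft)
channel = np.arange(n_fft // 2 + 1)
expected_phase = 2 * np.pi * channel / n_fft     # phase gradient of a 1 sample delay
print(np.angle(vis[:4]), expected_phase[:4])
```

The uncompensated one sample delay shows up as a linear phase gradient across the spectral channels, which is the quantity that the phase gradient correction described above removes for fractional delays.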
9.2.2 XF Correlator

Figure 9.5: Block diagram of an XF correlator.

Eq. 9.0.1 for a broadband signal after delay compensation and integration (time
average) can be written as

$$ \langle r_R \rangle = \mathrm{Re}\left[ \int \langle v_1(\nu,t)\, v_2^*(\nu,t) \rangle\, d\nu \right], \qquad (9.2.13) $$

where $v_1(\nu,t)$ and $v_2(\nu,t)$ can be considered as the spectral components of the signals
from the antennas. Introducing a delay of $\tau$ to one of the signals modifies the
above equation to

$$ \langle r_R(\tau) \rangle = \mathrm{Re}\left[ \int_{-\infty}^{+\infty} \langle v_1(\nu,t)\, v_2^*(\nu,t) \rangle\, e^{-j2\pi\nu\tau}\, d\nu \right] \qquad (9.2.14) $$

The above equation is a Fourier Transform equation; $\langle r_R(\tau) \rangle$ is the Fourier Transform of the cross
spectrum $\langle v_1(\nu,t)\, v_2^*(\nu,t) \rangle$ (averaging over $t$). Thus $\langle r_R(\tau) \rangle$ is the cross correlation
measured as a function of $\tau$. Since $v_1(\nu,t)$ and $v_2(\nu,t)$ are Hermitian functions, as they
are spectra of real signals, their product is also Hermitian. Therefore $\langle r_R(\tau) \rangle$ is a real
function and hence it can be measured with a simple correlator (not a complex correlator).
Thus the second method of measuring spectral visibility is to measure $\langle r_R(\tau) \rangle$ for each
pair of antennas as a function of $\tau$ and later perform a Fourier Transform to get the
cross spectrum. The digital implementation of this method is called an XF correlator.

A block diagram of an XF correlator is shown in Fig. 9.5. In this diagram, fractional
delays are compensated for by changing the phase of the sampling clock. After delay
compensation, the cross correlations for different delays are measured using delay lines
and multipliers, which are followed by integrators. Since the cross correlation function in
general is not an even function of $\tau$, the delay compensation is done such that the
correlation function is measured for both positive and negative values of $\tau$ in the correlator.
The zero lag autocorrelations of the signals are also measured, which are used to normalize
the cross correlation. The quantization correction (block marked as F) is then applied to
the normalized cross correlations. The cross spectrum is obtained by performing a DFT
on the corrected cross correlation function. A peculiarity of this implementation is that
the correlations are measured first and the Fourier Transform is taken later to get the
spectral information. Hence it is called an XF correlator.
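
The X then F order of operations can be sketched in the same spirit. The example below is an illustrative toy: it measures the lag correlation of two real noise series for both negative and positive lags and then takes a DFT. Normalization by the zero lag autocorrelations and the quantization correction are omitted, and the number of lags and the one sample delay are assumptions. The function name `xf_correlate` is hypothetical.

```python
import numpy as np

def xf_correlate(x1, x2, max_lag):
    """Cross-correlate for lags -max_lag..max_lag-1 (X), then DFT the lag
    function to obtain the cross spectrum (F)."""
    lags = np.arange(-max_lag, max_lag)
    n = len(x1) - 2 * max_lag
    ref = x1[max_lag:max_lag + n]
    r = np.array([np.mean(ref * x2[max_lag + m:max_lag + m + n]) for m in lags])
    # ifftshift moves lag 0 to index 0, which is the ordering the DFT expects.
    spectrum = np.fft.fft(np.fft.ifftshift(r))
    return lags, r, spectrum

rng = np.random.default_rng(6)
x1 = rng.normal(size=200_000)
x2 = np.roll(x1, 1)                              # x2 lags x1 by one sample

lags, r, spectrum = xf_correlate(x1, x2, max_lag=16)
peak_lag = lags[np.argmax(r)]
print(peak_lag, r.max())
```

The lag function is real and peaks at the lag equal to the applied delay. Because it is not symmetric about zero lag, its DFT is complex, which is how the real lag measurements carry both the amplitude and the phase of the cross spectrum.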


9.3 Further Reading 

1. Thompson, A.R., Moran, J.M., Swenson, G.W. Jr., “Interferometry and Synthesis in
Radio Astronomy”, Chapter 8, John Wiley & Sons, 1986.

2. Thompson, A.R. & D’Addario, L.R. in “Synthesis Imaging in Radio Astronomy”, R.A. 
Perley, F.R. Schwab, & A.H. Bridle, eds., ASP Conf. Series, vol. 6. 
Chapter 10

Mapping I 


Sanjay Bhatnagar 

In Chapters 2 & 4, the conceptual basis and formulation of aperture synthesis
in Radio Astronomy have been described. In particular, it has been shown that (1) an
interferometer records the mutual coherence function, also called the visibility of the 
signals from the sky, and (2) the visibility is the Fourier transform of the sky brightness 
distribution. This chapter describes the coordinate systems used in practical aperture 
synthesis in Radio Astronomy and presents the derivation of the 2D Fourier transform 
relation between the visibility and the brightness distribution. 


10.1 Coordinate Systems 

10.1.1 Angular Co-ordinates 

As described in Chapter 4, the response of an interferometer to quasi-monochromatic
radiation from a point source located at the phase center is given by

$$ r(\tau(t)) = \cos(2\pi\nu_0\tau), \qquad (10.1.1) $$

where $\tau = \tau_0 = (D/c)\sin(\theta(t))$ is the geometrical delay, $\theta$ is the direction which the
antennas are tracking with respect to the vertical direction, $\lambda$ is the wavelength, $\nu_0$ is the
center frequency of the observing band and $D$ is the separation between the antennas.
As the antennas track the source, the geometrical delay changes as a function of time.
This changing $\tau$ is exactly compensated with a computer controlled delay and for a point
source at the phase center, the output of the interferometer is the amplitude of the fringe
pattern.

For a source located at an angle $\theta = \theta_0 + \Delta\theta$, for small $\Delta\theta$, $\tau = \tau_0 + \Delta\theta\,(D/c)\cos(\theta(t))$. Since
fringe stopping compensates for $\tau_0$, the response of the interferometer for a source $\Delta\theta$
away from the phase center is $\cos(2\pi\Delta\theta D_\lambda \cos(\theta))$ where $D_\lambda = D/\lambda$. If the phase center is
shifted by the equivalent of $\lambda/4$, the interferometer will pick up an extra phase of $\pi/2$ and the
response will be sinusoidal instead of co-sinusoidal. Hence, an interferometer responds
to both the even and odd parts of the brightness distribution. The interferometer response can
then be written in complex notation as

$$ r(\tau(t)) = e^{2\pi\iota\,\Delta\theta D_\lambda \cos(\theta)}. \qquad (10.1.2) $$
Writing $u = D_\lambda \cos(\theta)$, which is the projected separation between the antennas in units of
wavelength in the direction normal to the phase center and $l = \sin(\Delta\theta) \approx \Delta\theta$, we get

$$ r(u, l) = e^{2\pi\iota u l} = e^{2\pi\iota u \Delta\theta} \qquad (10.1.3) $$

as the complex response of a two element interferometer for a point source of unit flux
located $\Delta\theta$ away from the phase center given by the direction $\theta_0$.

Usually the phase center coincides with the center of the field being tracked by all the
antennas. Let the normalized power reception pattern of the antennas (which are assumed to
be identical) at a particular frequency be $B(\Delta\theta)$ and the surface brightness of an extended
source be represented by $I(\Delta\theta)$. The response of the interferometer to a point source
located $\Delta\theta$ away from the phase center would then be $I(\Delta\theta)B(\Delta\theta)e^{2\pi\iota u \Delta\theta}$. For an extended
source with a continuous surface brightness distribution, the response is given by

$$ V(u) = \int B(\Delta\theta)\, I(\Delta\theta)\, e^{2\pi\iota u \Delta\theta}\, d\Delta\theta = \int B(l)\, I(l)\, e^{2\pi\iota u l}\, dl. \qquad (10.1.4) $$

The above equation is a 1D Fourier transform relation between the source brightness
distribution and the visibility function $V$. The integral is over the entire sky
visible to the antennas but is non-zero only for a range of $l$ limited by the antenna primary
reception pattern $B(l)$. In practice, $u$ is calculated as a function of the source position
in the sky, specified in an astronomical co-ordinate system, as seen by the observer on the
surface of the earth.

$l$ in the above equation is the direction of the elemental source flux relative to the
pointing center. $u$ then has the interpretation of spatial frequency and $V(u)$ represents
the 1D spatial frequency spectrum of the source.
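
The 1D relation of Eq. 10.1.4 can be evaluated numerically for a simple model. The sketch below assumes a gaussian source of unit flux offset from the phase center, a flat primary beam ($B = 1$) over the range considered, and a simple Riemann sum for the integral; all numerical values and the function name `visibility_1d` are illustrative assumptions.

```python
import numpy as np

def visibility_1d(u, l, brightness, beam):
    """Riemann sum approximation to Eq. 10.1.4 on a uniform grid of l."""
    dl = l[1] - l[0]
    kernel = np.exp(2j * np.pi * np.outer(u, l))     # e^{2 pi i u l}
    return kernel @ (beam * brightness) * dl

l = np.linspace(-0.05, 0.05, 2001)                   # direction cosine grid
l0, width = 0.01, 0.002                              # assumed offset and size
brightness = np.exp(-0.5 * ((l - l0) / width) ** 2) / (width * np.sqrt(2 * np.pi))
beam = np.ones_like(l)                               # flat primary beam (assumption)

u = np.array([0.0, 25.0, 50.0, 100.0])               # baselines in wavelengths
V = visibility_1d(u, l, brightness, beam)

# Analytic FT of an offset gaussian, for comparison.
V_analytic = np.exp(-2 * (np.pi * width * u) ** 2) * np.exp(2j * np.pi * u * l0)
print(np.abs(V), np.angle(V))
```

The visibility amplitude falls with increasing $u$ as the source becomes resolved, while the offset of the source from the phase center appears as a phase that grows linearly with $u$.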

10.1.2 Astronomical Co-ordinate System 

The position of a source in the sky can be specified in various spherical coordinate
systems in astronomy, differing from each other by the position of the origin and orientation
of the axes. The position of a source is specified using the azimuth and elevation
angles in these coordinate systems. In the Equatorial Co-ordinate system the source
position is specified by the Declination ($\delta$), which is the elevation of the source from the
celestial equator, and the Right Ascension (RA), which is the azimuthal
angle from a reference position (“the first point of Aries”). The reference direction for RA
is the line of intersection of the equatorial and ecliptic planes. The position of the source
in the sky, in this coordinate system, remains constant as the earth rotates. The azimuth
and elevation of the antennas, which rotate with the earth, are constantly adjusted to track
a point in the sky specified by (RA, $\delta$) coordinates. The changing position of the sources
in the sky, as seen by the observer on the surface of the earth, is specified by replacing RA
by Hour Angle (HA), which is the azimuth of the source measured in units of time, with
respect to the local meridian of the source with HA = $-6^h$ pointing due East.

10.1.3 Physical Coordinate System 

The antennas are located on the surface of the earth and rotate with respect to a source in
the sky due to the rotation of the earth. For aperture synthesis the antenna positions are
specified in a co-ordinate system such that the separation of the antennas is the projected
separation in the plane normal to the phase center. In other words, in such a co-ordinate
system the separation between the antennas is as seen by an observer sitting in the source
reference frame. This system, shown in Fig 10.1, is the right-handed $(u,v,w)$ coordinate system
fixed on the surface of the earth at the array reference point, with the $(u,v)$ plane always
parallel to the tangent plane in the direction of phase center on the celestial sphere and
the $w$ axis along the direction of phase center. The $u$ axis is along the astronomical E-W
direction and the $v$ axis is along the N-S direction. The $(u,v)$ co-ordinates of the antennas are
the E-W and N-S components of position vectors. As the earth rotates, the $(u,v)$ plane
rotates with the source in the sky, changing the $(u,v,w)$ coordinates of the antennas,
generating tracks in the $uv$-plane.


Figure 10.1: Relationship between the terrestrial co-ordinates (X,Y,Z) and the (u,v,w)
co-ordinate system. The (u,v,w) system is a right-handed system with the w axis pointing to
the source S.

In the above formulation, the u co-ordinate of one antenna is with respect to the
other antenna making the interferometer, which is located at the origin. If the origin is
arbitrarily located and the co-ordinates of the two antennas are $u_1$ and $u_2$, Eq. 10.1.3
becomes

$$ r(u,l) = e^{2\pi i (u_1 - u_2) l}. \qquad (10.1.5) $$

Since only the relative positions of the antennas with respect to each other enter
the equations, it is useful to work only with the differences between the position vectors
of the various antennas in the (u,v,w) co-ordinate system. The relative position vectors are
called “baseline vectors” and their lengths are referred to as “baseline lengths”.

The source surface brightness distribution is represented as a function of the direction
cosines in the (u,v,w) coordinate system. In Eq. 10.1.4 above, l is the direction cosine.
The source coordinate system, which is flat only for small fields of view, is represented
by (l,m,n). Since this coordinate system represents the celestial sphere, n is not an
independent coordinate and is constrained to be $n = \sqrt{1 - l^2 - m^2}$.





CHAPTER 10. MAPPING I 


10.1.4 Coordinate Transformation 

To compute the (u,v,w) co-ordinates of the antennas, the antenna locations must first be
specified in a terrestrial co-ordinate system. The terrestrial coordinate system generally
used to specify the position of the antennas is a right-handed Cartesian coordinate system
as shown in Figure 10.2. The (X,Y) plane is parallel to the earth’s equator with X in the
meridian plane and Y towards the east. Z points towards the north celestial pole. In terms of
the astronomical coordinates (HA, $\delta$), X = ($0^h$, $0^\circ$), Y = ($-6^h$, $0^\circ$) and Z = ($\delta = 90^\circ$).
If the components of $D_\lambda$ are $(X_\lambda, Y_\lambda, Z_\lambda)$, then the components in the (u,v,w) system can
be expressed as


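$$
\begin{pmatrix} u \\ v \\ w \end{pmatrix} =
\begin{pmatrix}
\sin(HA) & \cos(HA) & 0 \\
-\sin(\delta)\cos(HA) & \sin(\delta)\sin(HA) & \cos(\delta) \\
\cos(\delta)\cos(HA) & -\cos(\delta)\sin(HA) & \sin(\delta)
\end{pmatrix}
\begin{pmatrix} X_\lambda \\ Y_\lambda \\ Z_\lambda \end{pmatrix}
\qquad (10.1.6)
$$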



As the earth rotates, the HA of the source changes continuously, generating a different set
of (u,v,w) co-ordinates for each antenna pair at each instant of time. The locus of the
projected antenna-spacing components u and v defines an ellipse, with hour angle as the
variable, given by

$$ u^2 + \left( \frac{v - Z_\lambda \cos\delta_0}{\sin\delta_0} \right)^2 = X_\lambda^2 + Y_\lambda^2, \qquad (10.1.7) $$

where ($HA_0$, $\delta_0$) defines the direction of the phase center. In the uv-plane, this is an
ellipse, referred to as the uv-track, with HA changing along the ellipse. The pattern
generated by all the uv points sampled by the entire array of antennas over the period of
observation is referred to as the uv-coverage and, as is clear from the above transformation
matrix, is different for different $\delta$. Examples of uv-coverage for a few declinations for
full synthesis with the GMRT array are shown in Figure 10.4.
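
The transformation of Eq. 10.1.6 and the elliptical track of Eq. 10.1.7 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the baseline components and the declination are made-up illustrative values, and the function name `xyz_to_uvw` is hypothetical rather than taken from any imaging package.

```python
import numpy as np

def xyz_to_uvw(xyz, ha, dec):
    """Rotate a baseline (X, Y, Z) into (u, v, w) as in Eq. 10.1.6 (angles in radians)."""
    sh, ch = np.sin(ha), np.cos(ha)
    sd, cd = np.sin(dec), np.cos(dec)
    rot = np.array([[sh, ch, 0.0],
                    [-sd * ch, sd * sh, cd],
                    [cd * ch, -cd * sh, sd]])
    return rot @ np.asarray(xyz, dtype=float)

# Hypothetical baseline components in units of wavelength (not a real GMRT baseline).
baseline = np.array([300.0, -1200.0, 450.0])
dec0 = np.radians(30.0)                                  # assumed phase-center declination
hour_angles = np.radians(np.linspace(-90.0, 90.0, 181))  # HA from -6 h to +6 h

track = np.array([xyz_to_uvw(baseline, ha, dec0) for ha in hour_angles])
u, v, w = track.T

# Left- and right-hand sides of the ellipse equation, Eq. 10.1.7.
lhs = u**2 + ((v - baseline[2] * np.cos(dec0)) / np.sin(dec0))**2
rhs = baseline[0]**2 + baseline[1]**2
ellipse_ok = np.allclose(lhs, rhs)

# The matrix is a pure rotation, so the baseline length is unchanged.
length_ok = np.allclose(np.linalg.norm(track, axis=1), np.linalg.norm(baseline))
```

Repeating the loop over every antenna pair and plotting (u, v) together with (-u, -v) would give a uv-coverage plot of the kind referred to above.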

The uv domain is the spatial frequency domain and the uv-coverage represents the spatial
frequencies sampled by the array. The shorter baselines (uv points closer to the origin)
provide the low resolution information about the source structure and are sensitive to the
large scale structure of the source, while the longer baselines provide the high resolution
information. The GMRT array configuration was designed to have roughly half the antennas
in a compact “Central Square” to provide the short-spacing information, which is
crucial for mapping extended sources and large scale structures in the sky. The uv-coverage
of the central square antennas is shown in Figure 10.5. Notice that there are no
measurements for (u = 0, v = 0). V(0,0) represents the total integrated flux received by the
antennas and is absent in the visibility data. The effect of this on the image will be
discussed later.

The astronomical coordinates depend on the line of intersection of the ecliptic and
equatorial planes. The uv-coverage in turn depends on the position of the source in the
astronomical coordinate system. Since the reference line of this coordinate system
changes because of the well known precession of the earth’s rotation axis, the uv-coverage
also becomes a function of the reference epoch for which the source position is specified.
For the purpose of comparison and consistency in the literature, all source positions are
specified in standard epochs (B1950 or J2000). Since each point in the (u,v,w) plane
measures a particular spatial frequency and this spatial frequency coverage differs from
one epoch to another, it is necessary to precess the source coordinates to the current
epoch (also called the “date coordinates”) prior to observations, and all processing of the
visibility data for the purpose of mapping must be done with (u,v,w) evaluated for the
epoch of observations. Precessing the visibilities to the standard epoch prior to inverting
Eq. 10.2.10 would require specifying the real and imaginary parts of the visibility at
(u,v,w) coordinates which are in fact not measured (since the uv-coverage changes with
epoch), introducing errors in the mapping procedure.

Figure 10.2: The (X,Y,Z) co-ordinate system used to specify antenna locations.


10.2 2D Relation Between Sky and Aperture Planes 

Below, we derive the generalized 2D Fourier transform relation between the visibility and
the source brightness distribution in the (u,v,w) system. The geometry for this derivation
is shown in Fig 10.3.

Let the vector $L_0$ represent the direction of the phase center and the vector $D_\lambda$
represent the location of all antennas of an array with respect to a reference antenna. Then
$\tau_g = D_\lambda . L_0$. Note that $2\pi D_\lambda . L_0 = 2\pi w$ is the phase by which the visibility should be
rotated to stop the fringe. For any source in the direction $L = L_0 + \sigma$, the output of an
interferometer after fringe stopping will be

$$ V(D_\lambda) = \int_{4\pi} I(L) B(L) e^{2\pi i D_\lambda . (L - L_0)} d\Omega. \qquad (10.2.8) $$

The vector L = (l,m,n) is in the sky tangent plane, $L_0$ is the unit vector along the w axis
and $D_\lambda$ = (u,v,w). The above equation can then be written as

$$ V(u,v,w) = \int\!\!\int I(l,m) B(l,m) e^{2\pi i \left( ul + vm + w(\sqrt{1 - l^2 - m^2} - 1) \right)} \frac{dl\,dm}{\sqrt{1 - l^2 - m^2}}. \qquad (10.2.9) $$

Figure 10.3: Relationship between the (l,m) co-ordinates and the (u,v,w) co-ordinates.

If the array is such that all antennas are exactly located in the (u,v) plane, w is exactly
zero and the above equation reduces to an exact 2D Fourier transform relation between
the source brightness distribution and the visibility. This is true for a perfect east-west
array (like WSRT or ATCA). However, to maximize the uv-coverage, arrays like the GMRT or
the VLA are not perfectly east-west. As mentioned earlier, the integrand in the above
equation is non-zero only for a small portion of the sky (being limited by the primary beam
pattern of the antennas). If the field of view being mapped is small (i.e. for small l and m),
$\sqrt{1 - l^2 - m^2} - 1 \approx -\frac{1}{2}(l^2 + m^2)$ and can be neglected. Eq. 10.2.9 becomes


$$ V(u,v,w) \approx V(u,v,0) = \int\!\!\int I(l,m) B'(l,m) e^{2\pi i (ul + vm)} dl\,dm, \qquad (10.2.10) $$

where $B' = B/\sqrt{1 - l^2 - m^2}$. Neglecting the w-term puts restrictions on the field of view
that can be mapped without being affected by the phase error, which is approximately
equal to $\pi (l^2 + m^2) w$. Eq. 10.2.10 shows the 2D Fourier transform relation between the
surface brightness and the visibility.
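
To get a feel for the size of this restriction, the phase error can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the tolerance of 0.1 radian is an assumed value, not one prescribed in these notes.

```python
import math

def w_term_phase_error(l, m, w):
    """Approximate magnitude of the neglected phase (radians): pi * (l^2 + m^2) * w."""
    return math.pi * (l * l + m * m) * w

def exact_w_phase(l, m, w):
    """Exact w-term phase from Eq. 10.2.9: 2*pi*w*(sqrt(1 - l^2 - m^2) - 1)."""
    return 2.0 * math.pi * w * (math.sqrt(1.0 - l * l - m * m) - 1.0)

def max_field_radius(w_max, max_error=0.1):
    """Largest r = sqrt(l^2 + m^2) (radians) keeping the error below max_error radians.

    max_error is an assumed tolerance chosen for illustration.
    """
    return math.sqrt(max_error / (math.pi * w_max))

# Hypothetical example: largest w of 10^4 wavelengths.
w_max = 1.0e4
r_max = max_field_radius(w_max)
r_max_arcmin = math.degrees(r_max) * 60.0
```

Since the allowed radius scales as $1/\sqrt{w_{max}}$, longer baselines (in wavelengths) shrink the field that can be mapped with the 2D approximation.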

Since there is a finite number of antennas in an aperture synthesis array, the uv-coverage
is not continuous. Let

$$ S(u,v) = \begin{cases} 1, & \text{for all measured } (u,v) \text{ points} \\ 0, & \text{everywhere else.} \end{cases} \qquad (10.2.11) $$

Then to represent the real life situation, assuming that B(l,m) = 1 over the extent of the
source, Eq. 10.2.10 becomes

$$ V(u,v) S(u,v) = \int\!\!\int I^D(l,m) e^{2\pi i (ul + vm)} dl\,dm. \qquad (10.2.12) $$

Inverting the above equation and using the convolution theorem, we get $I^D = I * DB$,
where DB is the Fourier transform of S. DB is the transfer function of the telescope
for imaging and is referred to as the Dirty Beam. $I^D$ represents the raw image produced
by an earth rotation aperture synthesis telescope and is referred to as the Dirty Map.
The contribution of the Dirty Beam to the map and methods of removing its effects will
be discussed in greater detail in later lectures.
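
The relation between the sampling function, the Dirty Beam and the Dirty Map can be illustrated on a small grid. This is a toy sketch, assuming NumPy: the sky, the random 30% sampling and the grid size are all made up, and NumPy's FFT uses the opposite sign in the exponent to Eq. 10.2.10, which does not affect the convolution relation being checked.

```python
import numpy as np

n = 64
rng = np.random.default_rng(0)

# Hypothetical sky: two point sources on an n x n grid (the true brightness I).
sky = np.zeros((n, n))
sky[32, 32] = 1.0
sky[40, 20] = 0.5

# Fully sampled visibilities on the FFT grid (index [0, 0] is u = v = 0).
vis = np.fft.fft2(sky)

# Sampling function S: 1 at "measured" cells, 0 elsewhere. Cells are picked at
# random and S is then made symmetric, S(-u,-v) = S(u,v), because each baseline
# also supplies the conjugate point.
sampling = (rng.random((n, n)) < 0.3).astype(float)
mirrored = np.roll(sampling[::-1, ::-1], shift=(1, 1), axis=(0, 1))
sampling = np.maximum(sampling, mirrored)

dirty_beam = np.fft.ifft2(sampling).real       # DB: transform of S
dirty_map = np.fft.ifft2(vis * sampling).real  # I^D: transform of V.S

# Convolution theorem check: I^D is I convolved (circularly) with DB. For point
# sources the convolution is a sum of shifted, scaled copies of the beam.
expected = (1.0 * np.roll(dirty_beam, (32, 32), axis=(0, 1))
            + 0.5 * np.roll(dirty_beam, (40, 20), axis=(0, 1)))
matches = np.allclose(dirty_map, expected)
```

Each source in the toy Dirty Map carries its own copy of the beam side-lobes, which is why the weaker source can be hard to pick out before deconvolution.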

In all the above discussion, we have assumed that the observations are monochromatic
with negligible frequency bandwidth and that the (u,v) measurements are instantaneous
measurements. Neither of these assumptions is true in real life. Observations for
continuum mapping are made with as large a frequency bandwidth as possible (to maximize the
sensitivity) and the data is recorded after finite integration. Both result in degradation
in the map plane and these effects will be discussed in the later chapters.

Neglecting the w-term essentially implies that the source brightness distribution is 
approximated to be restricted to the tangent plane at the phase center in the sky rather 
than on the surface of the celestial sphere. At low frequencies, where the antenna primary 
beams are larger and the radio emission from sources is also on a larger scale, this 
assumption restricts the mappable part of the sky to a fraction of the primary beam. 
Methods to relax this assumption will also be discussed in a later lecture. 


10.3 Further Reading 

1. Interferometry and Synthesis in Radio Astronomy; Thompson, A. Richard, Moran, 
James M., Swenson Jr., George W.; Wiley-Interscience Publication, 1986. 

2. Synthesis Imaging In Radio Astronomy; Eds. Perley, Richard A., Schwab, Frederic
R., and Bridle, Alan H.; ASP Conference Series, Vol 6.


Sanjay Bhatnagar


11.1 Introduction 

An aperture synthesis array measures the visibilities at discrete points in the uv-domain.
The visibilities are Fourier transformed to get the Dirty Map and the weighted uv-sampling
function is Fourier transformed to get the Dirty Beam, using the efficient FFT algorithm.
This lecture describes the entire chain of data processing required to invert the
visibilities recorded as a function of (u,v,w), and the resulting errors/distortions in the
final image. In this entire lecture, the ‘*’ operator represents the convolution operation,
the ‘.’ operator represents point-by-point multiplication and the ‘$\rightleftharpoons$’ operator represents the
Fourier transform.

As described earlier, the visibility V measured by an aperture synthesis telescope is
related to the sky brightness distribution I as

$$ V \rightleftharpoons I, \qquad (11.1.1) $$

where $\rightleftharpoons$ denotes the Fourier transform. The above equation is true only for the case
of continuous sampling of the uv-plane such that V is measured for all values of (u,v).
However, since there is a finite number of antennas in an array, the uv-plane is sampled at
discrete uv points and Eq. 11.1.1 has to be written as

$$ V.S \rightleftharpoons I * DB \; (= I^d), \qquad (11.1.2) $$

where $I^d$ is the Dirty Map, I is the true brightness distribution, DB is the Dirty Beam and
S is the uv-sampling function given by

$$ S(u,v) = \sum_k \delta(u - u_k, v - v_k), \qquad (11.1.3) $$

where $u_k$ and $v_k$ are the actual (u,v) points measured by the telescope. The pattern of all
the measured (u,v) points is referred to as the uv-coverage.

This function essentially assigns a weight of unity to all measured points and zero
everywhere else in the uv-plane. The Fourier transform of S is referred to as the Dirty Beam.
As written in Eq. 11.1.2, the Dirty Beam is the transfer function of the instrument used
as an imaging device. The shape of the Dirty Beam is a function of the uv-coverage, which
in turn is a function of the location of the antennas. The Dirty Beam for a fully covered
uv-plane will be equal to $\sin(\pi l u_{max}/\lambda)/(\pi l u_{max}/\lambda)$, where $u_{max}$ is the largest antenna spacing
for which a measurement is available. The width of the main lobe of this function is
proportional to $\lambda/u_{max}$. The resolution of such a telescope is therefore roughly $\lambda/u_{max}$,
and $u_{max}$ can be interpreted as the size of an equivalent lens. For a real uv-coverage,
however, S is not flat up to $u_{max}$ and has ‘holes’ in between, representing un-sampled (u,v)
points. The effect of this missing data is to increase the side-lobes and make the Dirty
Beam noisy, but in a deterministic manner. Typically, an elliptical gaussian can be fitted
to the main lobe of the Dirty Beam and is used as the resolution element of the telescope.
The fitted gaussian is referred to as the Synthesized Beam.

The Dirty Map is a convolution of the true brightness distribution and the Dirty Beam.
$I^d$ is almost never a satisfactory final product since the side-lobes of DB (which are due
to missing spacings in the uv-domain) from a strong source in the map will contaminate
the entire map at levels higher than the thermal noise in the map. Without removing
the effect of DB from the map, the effective RMS noise in the map will be much higher
than the thermal noise of the telescope and will result in the obscuration of faint sources
in the map. This is equivalent to a reduction in the dynamic range of the map.
The process of deconvolution, discussed in a later lecture, effectively attempts
to estimate I from $I^d$ such that $(I * DB - I^d)$ is minimized, consistent with the estimated
noise in the map.

To use the FFT algorithm for Fourier transforming, the irregularly sampled visibility
V(u,v) needs to be gridded onto a regular grid of cells. This operation requires
interpolation to the grid points and then re-sampling the interpolated function. To get better
control on the shape of the Dirty Beam and on the signal-to-noise ratio in the map, the
visibility is first re-weighted before being gridded. These operations are described below.


11.2 Weighting, Tapering and Beam Shaping 

The shape of the Dirty Beam can be controlled by multiplying S with other weighting
functions. Note that the measured visibilities already carry a weight which is a measure
of the signal-to-noise ratio of each measurement. Since there is no control on this weight
while mapping, it is not explicitly written in any of the equations here but is implicitly used.

The full weighting function W as used in practice is given by

$$ W(u,v) = \sum_k T_k D_k \delta(u - u_k, v - v_k). \qquad (11.2.4) $$

The function $T_k$ is the ‘uv-tapering’ function used to control the shape of DB and $D_k$ is the
‘density-weighting’ function used in all imaging programs. If S was a smooth function,
going smoothly to zero beyond the maximum sampled uv-point, DB would also be smooth
with no side-lobes (e.g. if S was a gaussian). However, S is a collection of delta functions
with gaps in between (for the missing uv-points not measured by the telescope) and has a
sharp cut-off at the limit of the uv-coverage. This results in DB being a highly non-smooth
function with potentially large side-lobes.

As is evident from the plots of uv-coverage, the density of uv-tracks decreases away
from the origin. If one were to use the local average of the uv-points in the uv-plane
for mapping, as is done in the gridding operation described below, the signal-to-noise
ratio of the averaged points would increase with the number of uv-points averaged (as its
square root, for independent noise). Since the density of measured uv-points is higher for
smaller values of u and v, visibilities for shorter spacings get higher weightage in the
visibility data, effectively making the array relatively more sensitive to the broader
features in the sky. The function $D_k$ controls the weights resulting from the non-uniform
density of the points in the uv-plane.

Both $T_k$ and $D_k$ provide some control over the shape of the Dirty Beam. $T_k$ is used
to weight down the outer edge of the uv-coverage to decrease the side-lobes of DB at the
expense of decreasing the spatial resolution. $D_k$ is used to counter the preferential weight
that the uv-points closer to the origin get, at the expense of degrading the signal-to-noise
ratio.

$T_k$ is a smoothly varying function of (u,v) and is often used as $T(u_k, v_k) = T(u_k) T(v_k)$.
For most imaging applications, $T(u_k, v_k)$ is a circularly symmetric gaussian. However,
other forms are also occasionally used.

Two forms of $D_k$ are commonly used. When $D_k = 1$ for all values of (u,v), it is referred
to as ‘natural weighting’, where the natural weighting of the uv-coverage is used as it is.
This gives the best signal-to-noise ratio and is good when imaging weak compact sources but
is undesirable for extended sources where both large scale and small scale features are
present.

When $D_k = 1/N_k$, where $N_k$ is a measure of the local density of uv-points around
$(u_k, v_k)$, it is referred to as ‘uniform weighting’, where an attempt is made to assign uniform
weights to the entire covered uv-plane. In standard data reduction packages available for
use currently (AIPS, SDE, Miriad), while re-gridding the visibilities (discussed below), $N_k$
is equal to the number of uv-points within a given cell in the uv-plane. However, it can be
shown that this can result in serious errors, referred to as catastrophic gridding errors,
in some pathological cases. This problem can be handled to some extent by using better
ways of estimating the local density of uv-points (Briggs, 1995).
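
The two density-weighting schemes can be sketched in a few lines. This is an illustrative sketch only, assuming NumPy: the function name and arguments are hypothetical, $N_k$ is taken to be the simple count of points sharing a uv cell (the per-cell estimate described above, not the improved estimates of Briggs), and the tapering function $T_k$ is left out.

```python
from collections import Counter

import numpy as np

def density_weights(u, v, cell, scheme="natural"):
    """Return D_k for each uv-point.

    natural: D_k = 1 for every point.
    uniform: D_k = 1 / N_k, with N_k the number of points in the same uv cell.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    if scheme == "natural":
        return np.ones_like(u)
    if scheme != "uniform":
        raise ValueError("scheme must be 'natural' or 'uniform'")
    # Integer cell indices of each point; points with equal indices share a cell.
    iu = np.floor(u / cell).astype(int).tolist()
    iv = np.floor(v / cell).astype(int).tolist()
    keys = list(zip(iu, iv))
    counts = Counter(keys)
    return np.array([1.0 / counts[key] for key in keys])

# Hypothetical points: three crowd into one cell near the origin, one lies alone.
u_pts = [0.1, 0.2, 0.3, 5.5]
v_pts = [0.1, 0.4, 0.2, 5.5]
natural = density_weights(u_pts, v_pts, cell=1.0, scheme="natural")
uniform = density_weights(u_pts, v_pts, cell=1.0, scheme="uniform")
```

With uniform weighting the crowded cell and the isolated cell contribute equally to the map, which is the intent of the scheme; the price is that the three measurements near the origin no longer get the full benefit of being averaged.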

Eq. 11.1.2, using the weighted sampling function W, is written as

$$ (V.S.W) \rightleftharpoons (I * DB). \qquad (11.2.5) $$

Note that $DB \rightleftharpoons S.W$, i.e. the Dirty Beam is the Fourier transform of the weighted
sampling function.


11.3 Gridding and Interpolation 

The inversion of the visibilities to make the Dirty Map is done using the FFT algorithm, which
requires that the function be sampled at regular intervals and that the number of samples
be a power of 2. For the case of mapping the sky using an aperture synthesis telescope,
this implies that the visibility data be available on a regular 2D grid in the uv-plane.
Thus re-gridding of the data onto a regular grid is required, potentially by interpolating
the visibility to the grid points, since the visibility function V(u,v) is measured at discrete
points (u,v) which are not assured to be at regular intervals along the u and v axes.

Interpolation of V is done by multiplying by a function with a finite support base, centered
at each grid point, and averaging all the measured points which lie within the range of the
function. The resultant average value is assigned to the corresponding grid
point. This operation is equivalent to the discrete convolution of V with the above mentioned
function and then sampling this convolution at the grid points. The convolving function
is referred to as the Gridding Convolution Function (GCF). There are other ways of doing
this interpolation. However, the interpolation in practice is done by convolution since this
gives predictable results in the map plane which are easy to visualize. Also, using a
GCF with a finite support base results in each grid point getting the value of the local
average of the visibilities.

After gridding, Eq. 11.2.5 becomes

$$ (V.S.W) * C \rightleftharpoons (I * DB).c, \qquad (11.3.6) $$

where C represents the GCF and $c \rightleftharpoons C$.
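
The convolutional gridding described above can be sketched as follows. This is a deliberately slow, minimal illustration, assuming NumPy: the function name, the choice of a truncated gaussian as the GCF, and the `support` and `width` parameters are all assumptions made for the example, not the scheme used by any particular package.

```python
import numpy as np

def grid_visibilities(u, v, vis, weights, n, cell, support=3, width=1.0):
    """Convolve weighted visibilities onto an n x n grid with a truncated gaussian GCF.

    Grid index (n // 2, n // 2) corresponds to (u, v) = (0, 0); `support` is the
    number of cells covered by the GCF along each axis (odd), `width` its 1/e
    half-width in cells. Returns the gridded sum and the summed GCF weights.
    """
    grid = np.zeros((n, n), dtype=complex)
    wsum = np.zeros((n, n))
    half = support // 2
    for uk, vk, visk, wk in zip(u, v, vis, weights):
        iu = int(round(uk / cell)) + n // 2   # nearest grid column
        iv = int(round(vk / cell)) + n // 2   # nearest grid row
        for j in range(iv - half, iv + half + 1):
            for i in range(iu - half, iu + half + 1):
                if 0 <= i < n and 0 <= j < n:
                    du = (i - n // 2) * cell - uk
                    dv = (j - n // 2) * cell - vk
                    c = np.exp(-(du**2 + dv**2) / (width * cell) ** 2)
                    grid[j, i] += wk * c * visk
                    wsum[j, i] += wk * c
    return grid, wsum

# Hypothetical single visibility lying exactly on a grid point.
grid, wsum = grid_visibilities([2.0], [-1.0], [1.0 + 2.0j], [1.0], n=16, cell=1.0)

# Dividing by the summed weights gives the local weighted average at each cell.
average = np.divide(grid, wsum, out=np.zeros_like(grid), where=wsum > 0)
```

Each visibility is spread over `support` x `support` cells, so cells that collect contributions from several visibilities end up holding their GCF-weighted average.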

The effect of gridding the visibilities on the map is to multiply the map with the function c,
and since C has a finite support base (i.e. is of finite extent), c is infinite in extent, which
results in aliasing in the map plane (the other cause of aliasing could be under-sampling
of the uv-plane). The amplitude of the aliased component from a position (l,m) in the map
is determined by c(l,m). Ideally therefore, this function should be a rectangular function
with the width equal to the size of the map and going smoothly to zero immediately outside
the map. However, from the point of view of the efficiency of the gridding process, this is
not possible, and the GCFs used in practice are a trade-off between the roll-off properties at
the edge and flatness within the map.

Since the Dirty Map is multiplied by c, if c is well known, the effect of convolution
by the GCF can be removed by point-wise division of the Dirty Map and the Dirty Beam, given
by $I^d \rightarrow I^d/c$ and $DB \rightarrow DB/c$, for later processing, particularly in the deconvolution of $I^d$.
In practice however, this division is not carried out by evaluating c(l,m) over the map.
Instead, for efficiency purposes, this function is kept in the computer memory, tabulated
with a resolution of typically 1/100 times the size of the cell in the image.

To take the Fourier transform of (V.S.W) * C using the FFT algorithm, one needs to
sample the left hand side of Eq. 11.3.6 by multiplication with the re-sampling function
R given by

$$ R(u,v) = \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \delta(j - u/\Delta u, k - v/\Delta v), \qquad (11.3.7) $$

where $\Delta u$ and $\Delta v$ are the cell sizes in the uv-domain. Eq. 11.3.6 then becomes

$$ R.((V.S.W) * C) \rightleftharpoons r * ((I * DB).c), \qquad (11.3.8) $$

where $R \rightleftharpoons r$. The right hand side of this equation is then the approximation of $I^d$ obtained
in practice. As discussed in an earlier lecture, the FFT generates a periodic function (due to
the presence of R in the left hand side of Eq. 11.3.8) and $I^d$ represents one period of such
a function. To map an angular region of sky of size $(N_l \Delta l, N_m \Delta m)$, using the Nyquist
sampling theorem we get $N_l \Delta u = 1/\Delta l$ and $N_m \Delta v = 1/\Delta m$, where $\Delta l$ and $\Delta m$ are the cell
sizes in the map and $\Delta u$ and $\Delta v$ are the cell sizes in the uv-plane.
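
As a small worked example of the relation $N_l \Delta u = 1/\Delta l$, the sketch below converts a chosen image size and cell size into the corresponding uv cell size. The image parameters (1024 pixels of 2 arcsec) are hypothetical values picked for illustration, and the function name is not from any package.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)   # one arcsecond in radians

def uv_cell_size(n_pixels, image_cell_rad):
    """uv cell size in wavelengths, from N * delta_u = 1 / delta_l."""
    return 1.0 / (n_pixels * image_cell_rad)

# Hypothetical map: 1024 x 1024 pixels with 2 arcsec cells.
n_pix = 1024
delta_l = 2.0 * ARCSEC
delta_u = uv_cell_size(n_pix, delta_l)

# The grid then extends to |u| = (N / 2) * delta_u = 1 / (2 * delta_l), so the
# image cell must be small enough for the longest baseline to fit on the grid.
u_grid_max = 0.5 * n_pix * delta_u
```

The same relation read the other way says that the map size $N_l \Delta l$ is fixed by the uv cell size alone, while the image cell size is fixed by the extent of the uv grid.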

C is usually real and even and is assumed to be separable as $C(u,v) = C_1(u) C_2(v)$.
Various GCFs used in practice are listed below. The functions listed below are in 1 dimension
and are truncated (set to zero) for $|u| > m\Delta u/2$, where $\Delta u$ is the size of the grid cell and m is
the number of such cells used.

1. ‘Pillbox’ function

$$ C(u) = \begin{cases} 1, & |u| < m\Delta u/2 \\ 0, & \text{otherwise} \end{cases} \qquad (11.3.9) $$

This amounts to simple averaging of all the uv-points within the rectangle defined
by Eq. 11.3.9. However, since its Fourier transform is a sinc function with large side-lobes,
it provides poor alias rejection and is almost never used, but it is useful for intuitive
understanding.


2. Truncated exponential function

$$ C(u) = e^{-\left( |u| / (w \Delta u) \right)^{\alpha}}. \qquad (11.3.10) $$

Typically m = 6, w = 1 and $\alpha = 2$ are used, and c can be expressed in terms of the error
function.

3. Truncated sinc function

$$ C(u) = \mathrm{sinc}\left( \frac{u}{w \Delta u} \right). \qquad (11.3.11) $$

For m = 6 and w = 1, this is the normal sinc function expressed in terms of the sin
function. As m increases, the Fourier transform of this function approaches a step
function which is constant over the map and zero outside.

4. Sinc exponential function

$$ C(u) = e^{-\left( |u| / (w_1 \Delta u) \right)^{\alpha}} \, \mathrm{sinc}\left( \frac{u}{w_2 \Delta u} \right). \qquad (11.3.12) $$

For m = 6, $w_1$ = 2.52, $w_2$ = 1.55 and $\alpha = 2$, the above equation reduces to the product
of a gaussian and the sinc function. This is a compromise between the flat response
of the sinc within the map and the suppression of the side-lobes due to the presence of
the gaussian.

5. Truncated spheroidal function

$$ C(u) = |1 - \eta^2(u)|^{\alpha} \, \phi_{\alpha 0}(\pi m/2, \eta(u)), \qquad (11.3.13) $$

where $\phi_{\alpha 0}$ is the 0-order spheroidal function, $\eta(u) = 2u/(m\Delta u)$ and $\alpha > -1$.

Of all the square integrable functions, this is the most optimal in the sense that
it has the maximum contribution to the normalized area from the part of c(l) which is
within the map. This is referred to as the energy concentration ratio; that is,

$$ \frac{\int_{map} |c(l)|^2 \, dl}{\int_{-\infty}^{\infty} |c(l)|^2 \, dl} $$

is maximized.
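
The first four functions in the list above, and the energy concentration ratio used to compare them, can be explored numerically. The sketch below assumes NumPy, measures u in units of the uv cell size ($\Delta u = 1$), takes m = 1 for the pillbox as an illustrative choice, and estimates the ratio from a discrete Fourier transform; the spheroidal function is left out, and all function names are hypothetical.

```python
import numpy as np

# u is in units of the uv cell size, so delta_u = 1 throughout.

def pillbox(u, m=1):
    return np.where(np.abs(u) < m / 2.0, 1.0, 0.0)

def truncated_exp(u, m=6, w=1.0, alpha=2.0):
    c = np.exp(-(np.abs(u) / w) ** alpha)
    return np.where(np.abs(u) <= m / 2.0, c, 0.0)

def truncated_sinc(u, m=6, w=1.0):
    c = np.sinc(u / w)                  # np.sinc(x) = sin(pi x) / (pi x)
    return np.where(np.abs(u) <= m / 2.0, c, 0.0)

def exp_sinc(u, m=6, w1=2.52, w2=1.55, alpha=2.0):
    c = np.exp(-(np.abs(u) / w1) ** alpha) * np.sinc(u / w2)
    return np.where(np.abs(u) <= m / 2.0, c, 0.0)

def concentration_ratio(gcf, oversample=32, span=64):
    """Fraction of |c(l)|^2 falling inside the map, |l| <= 1 / (2 delta_u).

    The GCF is sampled `oversample` times per cell over `span` cells and
    transformed with an FFT, so the result is a numerical estimate.
    """
    n = span * oversample
    u = (np.arange(n) - n // 2) / oversample
    c = np.fft.fft(np.fft.ifftshift(gcf(u)))
    l = np.fft.fftfreq(n, d=1.0 / oversample)
    power = np.abs(c) ** 2
    return power[np.abs(l) <= 0.5].sum() / power.sum()

ratio_pillbox = concentration_ratio(pillbox)
ratio_exp_sinc = concentration_ratio(exp_sinc)
```

Comparing the two ratios shows the point made in the list: the pillbox leaves a substantial fraction of the energy of c outside the map, where it aliases back in, while the sinc exponential keeps nearly all of it inside.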


11.4 Bandwidth Smearing 

The effect of a finite bandwidth of observation, as seen by the multiplier in the correlator,
is to reduce the amplitude of the visibility by a factor given by
$\sin(\pi l \Delta\nu/\nu_0\theta)/(\pi l \Delta\nu/\nu_0\theta)$, where $\theta$ is the angular size of the synthesized beam, $\nu_0$ is
the center of the observing band, l is the location of the point source relative to the field
center and $\Delta\nu$ is the bandwidth of the signal being correlated.

The distortion in the map due to the finite bandwidth of observation can be visualized
as follows. For continuum observations, the visibility data integrated over the bandwidth
$\Delta\nu$ is treated as if the observations were made at a single frequency $\nu_0$, the central
frequency of the band. As a result the u and v co-ordinates and the values of the visibilities
are correct only for $\nu_0$. The true co-ordinates at other frequencies in the band are related to
the recorded co-ordinates as

$$ (u, v) = \left( \frac{\nu_0 u_\nu}{\nu}, \frac{\nu_0 v_\nu}{\nu} \right). \qquad (11.4.14) $$

Since the total weight W used while mapping does not depend on the frequency, the
relation between the brightness distribution and the visibility at a frequency $\nu$ becomes

$$ V(u,v) = V\left( \frac{\nu_0 u_\nu}{\nu}, \frac{\nu_0 v_\nu}{\nu} \right) \rightleftharpoons \left( \frac{\nu}{\nu_0} \right)^2 I\left( \frac{l\nu}{\nu_0}, \frac{m\nu}{\nu_0} \right). \qquad (11.4.15) $$


Hence the contribution of V(u,v) to the brightness distribution gets scaled by $(\nu/\nu_0)^2$
and the co-ordinates get scaled by $(\nu/\nu_0)$. The effect of the scaling of the co-ordinates,
assuming a delta function for the Dirty Beam, is to smear a point source at position (l,m)
into a line of length $(\Delta\nu/\nu_0)\sqrt{l^2 + m^2}$ in the radial direction. This will get convolved with
the Dirty Beam and the total effect can be found by integrating the brightness distribution
over the bandwidth as given in Eq. 11.4.15:

$$ I^d(l,m) = \left[ \frac{\int_0^\infty |H_{RF}(\nu)|^2 \left( \frac{\nu}{\nu_0} \right)^2 I\left( \frac{l\nu}{\nu_0}, \frac{m\nu}{\nu_0} \right) d\nu}{\int_0^\infty |H_{RF}(\nu)|^2 \, d\nu} \right] * DB_0(l,m), \qquad (11.4.16) $$

where $H_{RF}(\nu)$ is the band-shape function of the RF band and $DB_0$ is the Dirty Beam at
frequency $\nu_0$. If one represents the synthesized beam as a gaussian function of standard
deviation $\sigma_b = \theta_b/\sqrt{8 \ln 2}$ and the bandpass by a rectangular function of width
$\Delta\nu$, the fractional reduction in the strength of a source located at a radial distance
$r = \sqrt{l^2 + m^2}$ is given by

$$ R_b = 1.064 \, \frac{\theta_b \nu_0}{r \Delta\nu} \, \mathrm{erf}\left( 0.833 \, \frac{r \Delta\nu}{\theta_b \nu_0} \right). \qquad (11.4.17) $$
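
Eq. 11.4.17 is easy to evaluate for a given observation. The sketch below uses only the Python standard library; the function name is hypothetical, and the two numerical constants are written in closed form on the assumption that 1.064 and 0.833 are the rounded values of $\sqrt{\pi/(4\ln 2)}$ and $\sqrt{\ln 2}$, which makes $R_b$ tend to 1 at the field center.

```python
import math

K1 = math.sqrt(math.pi / (4.0 * math.log(2.0)))   # about 1.064
K2 = math.sqrt(math.log(2.0))                     # about 0.833

def bandwidth_smearing_reduction(r, theta_b, dnu, nu0):
    """Peak reduction factor R_b of Eq. 11.4.17.

    r and theta_b must be in the same angular units, dnu and nu0 in the same
    frequency units. Assumes a gaussian beam and a rectangular bandpass.
    """
    beta = (r * dnu) / (theta_b * nu0)
    if beta == 0.0:
        return 1.0            # no smearing at the field center
    return K1 / beta * math.erf(K2 * beta)

# Hypothetical case: the source is one beam-width times nu0/dnu from the center,
# so that r * dnu / (theta_b * nu0) = 1.
reduction_at_unity = bandwidth_smearing_reduction(1.0, 1.0, 1.0, 1.0)
```

The only quantity that matters is the dimensionless product $r\Delta\nu/(\theta_b\nu_0)$, i.e. the radial smearing length measured in synthesized beam-widths.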

Eq. 11.4.16 is equivalent to averaging a large number of maps made from monochromatic
visibilities at $\nu$. Since each of such maps would scale by a different factor, a
source away from the center would move along the radial line from one map to another,
producing the radial smearing convolved with the Dirty Beam. Since the source away
from the center is elongated radially, its side-lobes (because of the Dirty Beam) will also
be elongated in the radial direction. As a result, the side-lobes of distant sources will be
elongated at the origin, but not in directions at 90° to the vector joining the source and
the origin.

The effect of bandwidth smearing can be reduced if the RF band is split into frequency
channels with smaller channel widths. This effectively reduces the $\Delta\nu$ as seen by the
mapping procedure; while gridding the visibilities, u and v can then be computed
separately for each channel and assigned to the correct uv-cell. The FX correlator used
in the GMRT provides up to 128 frequency channels over the bandwidth of observation.


11.5 Time Average Smearing 

As discussed before, the u and v co-ordinates of an antenna are a function of time and
continuously change as the earth rotates, generating the uv-coverage. To improve the
signal-to-noise ratio as well as to reduce the data volume, the visibility function V(u,v) is
recorded after finite integration in time (typically 10-20s for imaging projects) and the average
values of the real and imaginary parts of V are used for the average values of u and v over
the integration time. Effectively then, the assigned values of u and v for each visibility
point are evaluated for a time which differs from the correct (instantaneous) time by a
maximum of $\tau/2$, where $\tau$ is the integration time.

In the map domain, the resulting effect can be visualized by treating the map made
from the time averaged visibilities as the average of a number of maps made from the
instantaneous (un-averaged) visibilities. The baseline vectors in the uv-domain follow the
loci of the uv-tracks (which are elliptical tracks) and rotate at an angular velocity equal
to that of the earth, $\omega_e$. Since a rotation of one domain results in a rotation by an equal
amount in the conjugate domain in a Fourier transform relation, the effect in the map
domain is that the instantaneous maps are also rotated with respect to each other, at the
rate of $\omega_e$. Hence, a point source located at (l,m) away from the center of the map would
get smeared in the azimuthal direction. This effect is the same as the smearing effect due to
the finite bandwidth of observations, but in an orthogonal direction.


11.6 Zero-spacing Problem 

Since the visibility and the brightness distribution are related via a Fourier transform,
V(0,0) measures the total flux from the sky. However, since the difference between the antenna
positions is always finite, V(0,0) is never measured by an interferometer. For a point
source, it is easy to estimate this value by extrapolation from the smallest u and v for
which a measurement exists, since V as a function of baseline length is constant. However,
for an extended source, this value remains unknown and extrapolation is difficult.

For the purpose of understanding the effect of missing zero-spacings, we can multiply
the visibility in Eq. 11.3.6 by a rectangular function which is 0 around (u,v) = (0,0) and
1 elsewhere. In the map domain then, the Dirty Map gets convolved with the Fourier
transform of this function, which has a central negative lobe. As a result, extended
sources will appear to be surrounded by negative brightness in the map, which cannot
be removed by any processing. This can only be removed by either estimating the
zero-spacing flux while restoring I from $I^d$ or V, or by supplying the zero-spacing flux as an
external input to the mapping/deconvolution programs. The Maximum Entropy class of
image restoration algorithms attempts to estimate the zero-spacing flux, while the CLEAN
class of image restoration algorithms needs to be supplied this number externally. Both
of these will be discussed in the later lectures.
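
The negative bowl produced by the missing zero-spacing can be reproduced with a toy example. The sketch below assumes NumPy, uses a made-up gaussian source on a 128 x 128 grid, and removes only the single cell at (u,v) = (0,0), which is the smallest possible "hole"; a real array misses a whole region of short spacings, so the effect is usually stronger than shown here.

```python
import numpy as np

n = 128
y, x = np.indices((n, n)) - n // 2

# Hypothetical extended source: a gaussian of width 8 pixels, positive everywhere.
sky = np.exp(-(x**2 + y**2) / (2.0 * 8.0**2))

# Visibilities on the FFT grid; index [0, 0] holds V(0,0), the total flux.
vis = np.fft.fft2(np.fft.ifftshift(sky))
total_flux = vis[0, 0].real

# Drop the zero-spacing measurement and transform back to the map plane.
vis_no_zero = vis.copy()
vis_no_zero[0, 0] = 0.0
image = np.fft.fftshift(np.fft.ifft2(vis_no_zero)).real

# Without V(0,0) the map integrates to zero, so the positive source has to sit
# in a negative bowl; here the bowl is simply the mean of the true sky.
bowl_depth = image.min()
```

Supplying the total flux back at the (0,0) cell restores the original image exactly in this toy case, which is the sense in which the zero-spacing flux must be either estimated or provided externally.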

11.7 Further Reading 

1. Interferometry and Synthesis in Radio Astronomy; Thompson, A. Richard, Moran,
James M., Swenson Jr., George W.; Wiley-Interscience Publication, 1986.

2. Synthesis Imaging In Radio Astronomy; Eds. Perley, Richard A., Schwab, Frederic
R., and Bridle, Alan H.; ASP Conference Series, Vol 6.

3. High Fidelity Deconvolution of Moderately Resolved Sources; Briggs, Daniel; Ph.D.
Thesis, 1995, The New Mexico Institute of Mining and Technology, Socorro, New
Mexico, USA.


Chapter 12

Deconvolution in synthesis
imaging - an introduction


Rajaram Nityananda 


12.1 Preliminaries 

These lectures describe the two main tools used for deconvolution in the context of radio
aperture synthesis. The focus is on the basic issues, while other lectures at this school
will deal with aspects closer to the actual practice of deconvolution. The practice is
dominated by the descendants of a deceptively simple-looking, beautiful idea proposed
by J. Högbom (A&A Suppl. 15 417 1974), which goes by the name of CLEAN. About the
same time, another, rather different and perhaps less intuitive idea due to the physicist
E.T. Jaynes was proposed by J.G. Ables (A&A Suppl 15 383 1974) for use in astronomy.
This goes by the name of the Maximum Entropy Method, MEM for short. MEM took a
long time to be accepted as a practical tool and even today is probably viewed as an exotic
alternative to CLEAN. We will see, however, that there are situations in which it is likely to
do better, and even be computationally faster. The goal of these lectures is to give enough
background and motivation for new entrants to appreciate both CLEAN and MEM and go
deeper into the literature.


12.2 The Deconvolution Problem 

12.2.1 Interferometric Measurements 

An array like the GMRT measures the visibility function V(u,v) along baselines which
move along tracks in the u-v plane as the earth rotates. For simplicity, let us assume
that these measurements have been transferred onto a discrete grid and baselines are
measured in units of the wavelength. The sky brightness distribution I(l,m) in the field
of view is a function of l,m which are direction cosines of a unit vector to a point on the
celestial sphere referred to the u and v axes. The basic relationship between the measured
visibility function V and the sky brightness I is a Fourier transform.

\[ V(u,v) = \int\!\!\int I(l,m) \exp(-2\pi i(lu + mv)) \, dl \, dm. \]

This expression also justifies the term “spatial frequency” to describe the pair (u,v), since
u and v play the same role as frequency plays in representing time varying signals.
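
As a minimal numerical sketch of this relation (not part of the original lectures), the
integral can be replaced by a sum over a handful of point sources. The function name
`point_source_visibility` and the source lists below are illustrative assumptions.

```python
import numpy as np

def point_source_visibility(u, v, sources):
    """Visibility of a sky made of point sources.

    sources: iterable of (flux, l, m), with l, m direction cosines.
    u, v: baseline coordinates in wavelengths (scalars or arrays).
    Implements V(u,v) = sum_k S_k exp(-2 pi i (l_k u + m_k v)).
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    vis = np.zeros(np.broadcast(u, v).shape, dtype=complex)
    for flux, l, m in sources:
        vis += flux * np.exp(-2j * np.pi * (l * u + m * v))
    return vis

# Three hypothetical baselines, in wavelengths.
u = np.array([0.0, 100.0, 2500.0])
v = np.array([0.0, -40.0, 900.0])

# A source at the field centre has the same visibility on every baseline.
centre = point_source_visibility(u, v, [(3.0, 0.0, 0.0)])

# An offset source keeps its amplitude but acquires a phase linear in (u, v).
offset = point_source_visibility(u, v, [(1.0, 1e-3, 0.0)])
```

The phase of `offset` varies with baseline exactly as the phase of a sinusoid varies with
time, which is the sense in which (u,v) is a spatial frequency.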

Many things have been left out in this expression, such as the proper units, polarisation,
the primary beam response of the individual antennas, the non-coplanarity of the
baselines, the finite observing bandwidth, etc. But it is certainly necessary to understand
this simplified situation first, and the details needed to achieve greater realism can be
put in later.

Aperture synthesis, as originally conceived, involved filling in the u-v plane without
any gaps up to some maximum baseline b_max which would determine the angular
resolution. Once one accepts this resolution limit, and writes down zeros for visibility values
outside the measured circle, the Fourier transform can be inverted. One is in the happy
situation of having as many equations as unknowns. A point source at the field centre
(which has constant visibility) would be reconstructed as the Fourier transform of a
uniformly filled circular disk of diameter 2b_max. This is the famous Airy pattern with its
first zero at 1.22/(2b_max). The baseline b is already measured in wavelengths, hence the
missing λ in the numerator. But even in this ideal situation, there are some problems.
Given an array element of diameter D (in wavelengths again!), the region of sky of interest
could even be larger than a circle of angular diameter 2/D. A Fourier component
describing a fringe going through one cycle over this angle corresponds to a baseline of D/2. But
measuring such a short baseline would put two dishes into collision, and even baselines
somewhat larger than D run the risk of one dish shadowing the other. In addition, the
very lowest Fourier component corresponds to (u,v) = (0,0), the total flux in the primary
beam. This too is not usually measured in synthesis instruments. Thus, there is an
inevitable “short and zero spacings problem” even when the rest of the u-v plane is well
sampled.
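
The two numbers in this paragraph are easy to put into code. The sketch below is only
arithmetic; the baseline of 10^5 wavelengths and the dish of 100 wavelengths are
hypothetical values chosen for illustration, not the parameters of any particular telescope.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def airy_first_null(b_max):
    """First zero (radians) of the Airy pattern for a filled u-v disk.

    b_max is the disk radius, i.e. the longest baseline, in wavelengths.
    """
    return 1.22 / (2.0 * b_max)

def one_cycle_baseline(dish_diameter):
    """Baseline (wavelengths) whose fringe completes one cycle over an angle 2/D."""
    return dish_diameter / 2.0

# Hypothetical array: longest baseline 1e5 wavelengths, dishes 100 wavelengths across.
null_arcsec = airy_first_null(1.0e5) * ARCSEC_PER_RADIAN
b_short = one_cycle_baseline(100.0)
```

Since `b_short` is smaller than the dish diameter itself, the corresponding baseline cannot
be realised with two separate dishes, which is the short spacings problem in one line.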

12.2.2 Dirty Map and Dirty Beam 

But the real situation is much worse. With the advent of the Very Large Array (VLA),
the majestic filling in of the u-v plane with samples spaced at D/2 went out of style. If
one divides the field of view into pixels of size 1/(2b_max), then the total number of such
pixels (resolution elements) would be significantly larger than the number of baselines
actually measured in most cases. This is clearly seen in plots of u-v coverage which
have conspicuous holes in them. The inverse Fourier transform of the measured visibility
is now hardly the true map because of the missing data. But it still has a name: the
“dirty map” I^D. We define a sampling cum weighting function W(u,v) which is zero where
there are no measurements and in the simplest case (called uniform weighting) is just
unity wherever there are measurements. So we can get our limited visibility coverage
by taking the true visibilities and multiplying by W(u,v). This multiplication becomes a
convolution in the sky domain. The “true” map with full visibility coverage is therefore
convolved by the inverse Fourier transform of W which goes by the name of the “dirty
beam” B^D(l,m).

\[ I^D(l,m) = \int\!\!\int I(l',m') \, B^D(l - l', m - m') \, dl' \, dm' \]

where

\[ B^D(l,m) \propto \sum_{u,v} W(u,v) \exp(+2\pi i(lu + mv)). \]

For a patchy u-v coverage, which is typical of many synthesis observations, B^D has
strong sidelobes and other undesirable features. This makes the dirty map difficult to
interpret. What one sees in one pixel has contributions from the sky brightness in
neighbouring and even not so neighbouring pixels. For the case of W = 1 within a disk of
radius b_max we get an Airy pattern as mentioned earlier. This is not such a dirty beam
after all, and could be cleaned up further by making the weighting non-uniform, i.e.
tapering the function W down to zero near the edge |(u,v)| = b_max. For example, if this
weighting is approximated by a Gaussian, then the sky gets convolved by its transform,
another Gaussian. This dirty map is now related to the true one in a reasonable way.
But, as Ables remarked, should one go to enormous expense to build and measure the
longest baseline and then multiply it by zero?
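
The relation between the sampling function, the dirty beam and the dirty map can be
checked numerically on a grid. The following is a sketch under stated assumptions: the
coverage is a random mask rather than real earth-rotation tracks, W is stored in NumPy's
FFT order (zero spacing at index [0, 0]), the convolution is periodic, and the helper names
`symmetric_coverage` and `dirty_beam_and_map` are invented for this illustration.

```python
import numpy as np

def symmetric_coverage(n, fraction=0.3, seed=0):
    """Random n x n sampling function W with W(-u,-v) = W(u,v), in FFT order."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n, n)) < fraction
    mirrored = np.roll(mask[::-1, ::-1], 1, axis=(0, 1))  # mask evaluated at (-u,-v)
    return (mask | mirrored).astype(float)

def dirty_beam_and_map(true_map, W):
    """Return (dirty beam, dirty map) for a gridded sampling function W."""
    V = np.fft.fft2(true_map)               # "true" visibilities on the grid
    dirty_map = np.fft.ifft2(W * V).real    # inverse transform of the sampled data
    dirty_beam = np.fft.ifft2(W).real       # inverse transform of W alone
    return dirty_beam, dirty_map

n = 64
W = symmetric_coverage(n)
sky = np.zeros((n, n))
sky[5, 7] = 2.0                              # one point source of flux 2
beam, dmap = dirty_beam_and_map(sky, W)

# For a single point source the dirty map is a shifted, scaled dirty beam.
shifted_beam = 2.0 * np.roll(beam, (5, 7), axis=(0, 1))
```

The beam here is left unnormalised, so its peak equals the filled fraction of the grid;
dividing W by that fraction gives a dirty beam of unit peak.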

12.2.3 The Need for Deconvolution 

Clearly, there has to be a better way than just reweighting the data to make the dirty beam
look better (and fatter, incidentally, since one is suppressing high spatial frequencies).
But this better way has to play the dangerous game of interpolating (for short spacings
and for gaps in the u-v plane) and extrapolating (for values beyond the largest baseline)
the visibility function which was actually measured. The standard terminology is that
the imaging problem is “underdetermined” or “ill-posed” or “ill-conditioned”. It has fewer
equations than unknowns. However respectable we try to make it sound by this
terminology, we are no better than someone solving x + y = 1 for both x and y. Clearly, some
additional criterion which selects one (or a few) solutions out of the infinite number
possible has to be used. The standard terminology for this criterion is “a priori information”.
The term “a priori” was used by the philosopher Kant to describe things in the mind that
did not seem to need sensory input, and is hence particularly appropriate here.

One general statement can be made. If one finds more than one solution to a given
deconvolution problem fitting a given data set, then subtracting any two solutions should
give a function whose visibility has to vanish everywhere on the data set. Such a
brightness distribution, which contains only unmeasured spatial frequencies, is appropriately
called an “invisible distribution”. Our extrapolation/interpolation problem consists in finding
the right invisible distribution to add to the visible one!
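
An invisible distribution is easy to manufacture on a grid, which makes the point concrete.
The sketch below assumes a random, symmetric sampling mask in FFT order and periodic
images; the function names are hypothetical.

```python
import numpy as np

def symmetric_coverage(n, fraction=0.3, seed=0):
    """Random n x n sampling function W with W(-u,-v) = W(u,v), in FFT order."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n, n)) < fraction
    mirrored = np.roll(mask[::-1, ::-1], 1, axis=(0, 1))
    return (mask | mirrored).astype(float)

def invisible_distribution(W, seed=1):
    """A real map containing only the spatial frequencies that W did not sample."""
    rng = np.random.default_rng(seed)
    F = np.fft.fft2(rng.standard_normal(W.shape))
    F[W > 0] = 0.0                      # delete every measured spatial frequency
    return np.fft.ifft2(F).real

def dirty_map(true_map, W):
    return np.fft.ifft2(W * np.fft.fft2(true_map)).real

n = 32
W = symmetric_coverage(n)
sky = np.zeros((n, n))
sky[10, 12] = 1.0
ghost = invisible_distribution(W)

# The data cannot distinguish "sky" from "sky + ghost".
same_data = np.allclose(dirty_map(sky + ghost, W), dirty_map(sky, W))
```

Note that `ghost` takes both signs, which is why a positivity requirement on the map can
rule out many invisible distributions when most of the field is empty.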

One constraint often mentioned is the positivity of the brightness of each pixel. To 
see how powerful this can be, take a sky with just one point source at the field centre. 
The total flux and two visibilities on baselines (D/2,0), (0, D/2) suffice to pin down the 
map completely. The only possible value for all the remaining visibilities is equal to 
these numbers, which are themselves equal. One cannot add any invisible distribution to 
this because it is bound to go negative somewhere in the vast empty spaces around our 
source. But this is an extreme case. The power of positivity diminishes as the field gets
filled with emission.

Another interesting case is when the emission is known to be confined to a window
in the map plane. Define a function w(l,m) = 1 inside the window and zero outside.
Let \tilde{w}(u,v) be its Fourier transform. Multiplying the map by w makes no difference. In
Fourier space, this condition is quite non-trivial, viz. V(u,v) = V(u,v) * \tilde{w}(u,v), where
* denotes convolution. Notice how the convolution on the right transfers information from
measured to unmeasured parts of the u-v plane, and couples them.

12.3 CLEAN 

12.3.1 The Hogbom Algorithm 

Consider a sky containing only isolated point sources. In the dirty map, each appears as
a copy of the dirty beam, centred on the source position and scaled by its strength.
However, the maxima in the map do not strictly correspond to the source positions, because
each maximum is corrupted by the sidelobes of the others, which could shift it and alter
its strength. The least corrupted, and most corrupting, source is the strongest. Why not
take the largest local maximum of the dirty map as a good indicator of its location and
strength? And why not subtract a dirty beam of the appropriate strength to remove to a
great extent the bad effects of this strongest source on the others? The new maximum
after the subtraction now has a similar role. At every stage, one writes down the
coordinates and strengths of the point sources one is postulating to explain the dirty map.
If all goes well, then at some stage nothing (or rather just the inevitable instrumental
noise) would be left behind. We would have a collection of point sources, the so called
CLEAN components, which when convolved with the dirty beam give the dirty map.

One could exhibit this collection of point sources as the solution to the deconvolution
problem, but this would be arrogant, since one has only finite resolution. As a final
gesture of modesty, one replaces each point source by (say) a gaussian, a so called “CLEAN
beam”, and asserts that the sky brightness, convolved with this beam, has been found.
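
The procedure just described can be sketched in a few lines. This is an illustrative toy,
not the code of any imaging package: it assumes a dirty beam of unit peak stored in FFT
order (peak at index [0, 0]), periodic images, a search on the absolute value of the
residual, and a loop gain below one (a common precaution, fixed here at 0.2). All names
and default values are choices made for this example.

```python
import numpy as np

def hogbom_clean(dirty_map, dirty_beam, gain=0.2, threshold=1e-6, max_iter=10000):
    """Iteratively subtract scaled, shifted dirty beams from the largest peak."""
    residual = dirty_map.copy()
    components = np.zeros_like(dirty_map)
    for _ in range(max_iter):
        idx = np.unravel_index(np.argmax(np.abs(residual)), residual.shape)
        peak = residual[idx]
        if abs(peak) < threshold:
            break
        components[idx] += gain * peak           # record the CLEAN component
        residual -= gain * peak * np.roll(dirty_beam, idx, axis=(0, 1))
    return components, residual

def restore(components, residual, fwhm_pixels=3.0):
    """Convolve the components with a gaussian CLEAN beam and add the residuals."""
    n0, n1 = components.shape
    y = np.fft.fftfreq(n0) * n0                  # pixel offsets in FFT order
    x = np.fft.fftfreq(n1) * n1
    sigma = fwhm_pixels / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    clean_beam = np.exp(-(y[:, None] ** 2 + x[None, :] ** 2) / (2.0 * sigma**2))
    smooth = np.fft.ifft2(np.fft.fft2(components) * np.fft.fft2(clean_beam)).real
    return smooth + residual

# Toy data: random symmetric coverage and a single point source of unit flux.
n = 32
rng = np.random.default_rng(0)
mask = rng.random((n, n)) < 0.4
W = (mask | np.roll(mask[::-1, ::-1], 1, axis=(0, 1))).astype(float)
W /= W.sum() / n**2                              # gives the dirty beam unit peak
beam = np.fft.ifft2(W).real
sky = np.zeros((n, n))
sky[8, 20] = 1.0
dmap = np.fft.ifft2(W * np.fft.fft2(sky)).real

comps, resid = hogbom_clean(dmap, beam)
restored = restore(comps, resid)
model = np.fft.ifft2(np.fft.fft2(comps) * np.fft.fft2(beam)).real
```

By construction `model + resid` reproduces the dirty map, which is the statement that
the CLEAN components convolved with the dirty beam explain the data.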

This strategy, which seems so reasonable today, was a real breakthrough in 1974
when proposed by J. Högbom. Suddenly, one did not have to live with sidelobes caused
by incomplete u-v coverage. In fact, the planning for new telescopes like the VLA must
have taken this into account: one was no longer afraid of holes.

12.3.2 The Behaviour of CLEAN 

With hindsight, one can say that the initial successes were also due to the simplicity
of the sources mapped. It is now clear that one should not apply this method to
an extended source which covers several times the resolution limit (the width of the
central peak of the dirty beam). Such a source could have a broad, gentle maximum in
the dirty map, and subtracting a narrow dirty beam at this point would generate images
of the sidelobes with the opposite sign. This would generate new maxima where new
CLEAN components would be placed by the algorithm, and things could go unstable.
One precaution which certainly helps is the “gain factor” (actually a loss factor since it
is less than one). After finding a maximum, one does not subtract the full value but a
fraction g, typically 0.2 or less. In simple cases, this would just make the algorithm slower
but not change the solution. But this step actually helps when sources are more complex.
One is being conservative in not fully believing the sources found initially. This gives the
algorithm a chance to change its mind and look for sources elsewhere. If this sounds
like a description of animal behaviour, the impression being conveyed is correct. Our
understanding of CLEAN is largely a series of empirical observations and thumb rules,
with common sense rationalisations after the fact, but no real mathematical theory. One
exception is the work of Schwarz (A&A 65 345 1978) which interpreted each CLEAN
subtraction as a least squares fit of the current dirty map to a single point source. This
is interesting but not enough. CLEAN carries out this subtraction sequentially, and that
too with a gain factor. In principle, each value of the gain factor could lead to a different
solution, i.e. a different collection of CLEAN components, in the realistic case when the
number of u-v points is less than the number of resolution elements in the map.

So what are we to make of the practical successes of CLEAN? Simply that in those cases, the
patch of the sky being imaged had a large enough empty portion that the real number of
CLEAN components needed was smaller than the number of data points available in the
u-v plane. Under such conditions, one could believe that the solution is unique. Current
implementations of CLEAN allow the user to define “windows” in the map so that one does
not look for CLEAN components outside them. But when a large portion of the field of
view has some nonzero brightness, there are indeed problems with CLEAN. The maps
show spurious stripes whose separation is related to unmeasured spatial frequencies
(that’s how one deduces they are spurious). One should think of this as a wrong choice
of invisible distribution which CLEAN has made. Various modifications of CLEAN have
been devised to cope with this, but the fairest conclusion is that the algorithm was never
meant for extended structure. Given that it began with isolated point sources, it has done
remarkably well in other circumstances.

12.3.3 Beyond CLEAN 

Apart from the difficulties with extended sources, CLEAN as described above is an
inherently slow procedure. If N is the number of pixels, subtracting a single source needs of
the order of N operations. This seems a waste when this subtraction is a provisional,
intermediate step anyway! B.G. Clark had the insight of devising a faster version, which
operates with a truncated dirty beam, but only on those maxima in the map strong
enough that the far, weak sidelobes make little difference. Once these sources have been
identified by this rough CLEAN (called a “minor cycle”), they are subtracted together from
the full map using a fast Fourier transform (FFT) for the convolution, which takes only
N log N operations. This is called the “major cycle”. The new residual map now has a new
definition of “strong” and the minor cycle is repeated.
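
The saving in the major cycle comes from doing all the subtractions at once. The sketch
below compares the two routes for the subtraction step only; it does not implement the
truncated-beam minor cycle, it assumes periodic images with the beam stored in FFT order,
and the function names are hypothetical.

```python
import numpy as np

def subtract_components_direct(residual, components, dirty_beam):
    """One shifted-beam subtraction per component: about N operations each."""
    out = residual.copy()
    for idx in zip(*np.nonzero(components)):
        out -= components[idx] * np.roll(dirty_beam, idx, axis=(0, 1))
    return out

def subtract_components_fft(residual, components, dirty_beam):
    """All components at once via the convolution theorem: about N log N in total."""
    model = np.fft.ifft2(np.fft.fft2(components) * np.fft.fft2(dirty_beam)).real
    return residual - model

# Arbitrary test arrays; the identity holds for any beam and any component list.
n = 32
rng = np.random.default_rng(2)
beam = rng.standard_normal((n, n))
resid = rng.standard_normal((n, n))
comps = np.zeros((n, n))
comps[3, 4], comps[10, 25], comps[31, 0] = 1.0, -0.5, 2.0

direct = subtract_components_direct(resid, comps, beam)
fast = subtract_components_fft(resid, comps, beam)
```

With only three components the direct route is of course cheaper; the FFT route wins
once the minor cycle has accumulated many components.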

A more daring variant, due to Steer, Dewdney, and Ito (hence SDI CLEAN), carries out
the minor cycle by simply identifying high enough maxima, without even using CLEAN,
which is kept for the major cycle. Other efforts to cope with extended sources go under
the name of “multiresolution CLEAN”. One could start with the inner part of the u-v plane
and do a CLEAN with the appropriate, broader dirty beam. The large scale structure thus
subtracted will hopefully now not spoil the next stage of CLEAN at a higher resolution, i.e.
using more of the u-v plane.


12.4 Maximum Entropy 

12.4.1 Bayesian Statistical Inference 

This method, or class of methods, is easy to describe in the framework of an approach
to statistical inference (i.e. all of experimental science?) which is more than two hundred
years old, dating from 1763! Bayes’ theorem about conditional probabilities states that

\[ P(A|B)P(B) = P(B|A)P(A) = P(A,B). \]

As a theorem, it is an easy consequence of the definitions of joint probabilities (denoted
by P(A,B)), conditional probabilities (denoted by P(A|B)) and marginal or unconditional
probabilities (denoted by P(A)). In words, one could say that the fraction of trials in which A and
B both happen (P(A,B)) is the product of (i) the fraction of trials in which A happens
(P(A)) irrespective of B, and (ii) the further fraction of A-occurrences which are also
B-occurrences (P(B|A)). The other form for P(A|B) follows by interchanging the roles of A
and B.

The theorem acquires its application to statistical inference when we think of A as a
hypothesis which is being tested by measuring some data B. In real life, with noisy and
incomplete data, we never have the luxury of measuring A directly, but only something
depending on it in a nonunique fashion. If we understand this dependence, i.e.
understand our experiment, we know P(B|A). If only (and this is a big IF!) someone gave
us P(A), then we would be able to compute the dependence of P(A|B) on A from Bayes’
theorem.

\[ P(A|B) = P(B|A)P(A)/P(B). \]

Going from P(B|A) to P(A|B) may not seem to be a big step for a man, but it is a giant
step for mankind. It now tells us the probability of different hypotheses A being true
based on the given data B. Remember, this is the real world. More than one hypothesis
is consistent with a given set of data, so the best we can do is narrow down the
possibilities. (If “hypothesis” seems too abstract, think of it as a set of numbers which occur as
parameters in a given model of the real world.)
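
A two-hypothesis example shows the mechanics. The numbers are invented for illustration:
a prior that strongly favours the first hypothesis, and data that are much more probable
under the second.

```python
import numpy as np

def posterior(prior, likelihood):
    """Bayes' theorem for a discrete set of hypotheses A_k and fixed data B.

    prior[k]      = P(A_k)
    likelihood[k] = P(B | A_k)
    returns       P(A_k | B) = P(B | A_k) P(A_k) / P(B)
    """
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    joint = likelihood * prior      # P(A_k, B)
    evidence = joint.sum()          # P(B), summed over all hypotheses
    return joint / evidence

# Prior prejudice 9:1 for the first hypothesis, data 9:1 for the second.
post = posterior(prior=[0.9, 0.1], likelihood=[0.1, 0.9])
```

Here the prior and the data pull equally hard in opposite directions and the posterior
ends up evenly split, a reminder of how much the answer can depend on P(A).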

12.4.2 MEM Images 

Descending now from the sublime to aperture synthesis, think of A as the true map and
B as the dirty map, or equivalently its Fourier transform, the set of measured visibilities.
We usually want a single map, not a probability distribution of A. So we need the further
step of maximising P(A|B) with respect to A. All this is possible if P(A) is available for a
given true map I(l,m). One choice, advocated by Gull and Daniell in 1978, was to take

\[ \log P(\{I(l,m)\}) \propto -\int\!\!\int I(l,m) \ln I(l,m) \, dl \, dm. \]

The curly brackets around I on the left side are meant to remind us that the entropy
is a single number computed from the entire information about the brightness, i.e. the
whole set of pixel values. Physicists will note that this expression seems inspired by
Boltzmann’s formula for entropy in statistical mechanics, and communication engineers
will see the influence of Shannon’s concept of information. It was E.T. Jaynes writing in
the Physical Review of 1957 who saw a vision of a unified scheme into which physics,
communication theory, and statistical inference would fall (with the last being the most
fundamental!). In any case, the term “entropy” for the logarithm of the prior distribution
of pixel values has stuck. One can see that if the only data given was the total flux, then
the entropy as defined above is a maximum when the flux is distributed uniformly over
the pixels. This is for the same reason that the Boltzmann entropy is maximised when
a gas fills a container uniformly. This is the basis for the oft-heard remark that MEM
produces the flattest or most featureless map consistent with the data, a statement we
will see requires some qualification. But if one does not want this feature, a modified
entropy function which is the integral over the map of -I ln(I/I_d) is defined. I_d(l,m) is
called a “default image”. One can now check that if only total flux is given the entropy is
a maximum for I ∝ I_d.
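
The last claim can be checked with a Lagrange multiplier. In the sketch below the integral
is replaced by a sum over pixels k, and the total flux F and the multiplier μ are notation
introduced only for this check.

```latex
\begin{align*}
S &= -\sum_k I_k \ln\frac{I_k}{I_{d,k}}, \qquad \sum_k I_k = F, \\
\frac{\partial}{\partial I_k}\Big[\, S - \mu \Big(\sum_j I_j - F\Big) \Big]
  &= -\ln\frac{I_k}{I_{d,k}} - 1 - \mu = 0, \\
I_k &= e^{-(1+\mu)}\, I_{d,k} \;\propto\; I_{d,k}.
\end{align*}
```

The second derivative of S with respect to I_k is -1/I_k, which is negative, so the
stationary point is a maximum. Setting I_d to a constant recovers the uniform map of the
unmodified entropy.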

The selection of a prior is, in my view, the weakest part of Bayesian inference, so
we will sidestep the debate on the correct choice. Rather, let us view the situation as
an opportunity, a license to explore the consequences of different priors on the “true”
maps which emerge. This is easily done by simulation: take a plausible map, Fourier
transform, sample with a function W so that some information is now missing, and use
your favourite prior and maximise “entropy” to get a candidate for the true map. It is
this kind of study which was responsible for the great initial interest in MEM. Briefly,
what MEM seemed to do in simple cases was to eliminate the sidelobes and even resolve
pairs of peaks which overlapped in the true map, i.e. it was sometimes “better” than the
original! This last feature is called superresolution, and we will not discuss this in the
same spirit of modesty that prompted us to use a CLEAN beam. Unlike CLEAN, MEM did
not seem to have a serious problem with extended structure, unless it had a sharp edge
(like the image of a planet). In this last case, it was found that MEM actually enhanced
the ripples near the edge which were sitting at high brightness levels, though it controlled
the ripples which were close to zero intensity. This is perhaps not surprising if one looks
at the graph of the function -I ln I. There is much more to be gained by removing ripples
near I = 0 than at higher values of I, since the derivative of the function is higher near
I = 0.

Fortunately, these empirical studies of the MEM can be backed up by an
analytical/graphical argument due to Ramesh Narayan, which is outlined below. The full
consequences of this viewpoint were developed in a review article (Annual Review of Astronomy
and Astrophysics 24 127 1986), so they will not be elaborated here, but the basic
reasoning is simple and short enough. Take the expression for the entropy, and differentiate it
with respect to the free parameters at our disposal, namely the unmeasured visibilities,
and set to zero for maximisation. The derivative of the entropy taken with respect to a
visibility V(u',v') is denoted by M(u',v'). The understanding is that u',v' have not been
measured. The condition for a maximum is

\[ M(u',v') = \int\!\!\int \left(-1 - \ln I(l,m)\right) \exp(+2\pi i(lu' + mv')) \, dl \, dm = 0. \]

This can be interpreted as follows. The logarithm of the brightness is like a dirty map, i.e. it
has no power at unmeasured baselines, and hence has sidelobes etc. But the brightness
I itself is the exponential of this “band limited function” (i.e. one with limited spatial
frequency content). Note first of all that the positivity constraint is nicely implemented:
exponentials are positive. Since the exponential varies rather slowly at small values of
I, the ripples in the “baseline” region between the peaks are suppressed. Conversely,
the peaks are sharpened by the steep rise of the exponential function at larger values of
I. One could even take the extreme point of view that the MEM stands unmasked as a
model fitting procedure with sufficient flexibility to handle the cases usually encountered.
Högbom and Subrahmanya independently emphasised very early that the entropy is just
a penalty function which encourages desirable behaviour and punishes bad features in
the map (IAU Colloq. 49, 1978). Subrahmanya’s early work on the deconvolution of lunar
occultation records at Ooty (TIFR thesis, 1977) was indeed based on such penalties.
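
For readers who want the intermediate step, the condition for a maximum quoted above
follows from the chain rule. The sketch writes the map as a discrete inverse transform
of the visibilities and drops overall normalisation constants, both of which are
simplifications made here.

```latex
\begin{align*}
S &= -\int\!\!\int I \ln I \, dl\, dm, \qquad
I(l,m) = \sum_{u,v} V(u,v)\, e^{+2\pi i (lu + mv)}, \\
\frac{\partial I(l,m)}{\partial V(u',v')} &= e^{+2\pi i (lu' + mv')}, \qquad
\frac{\partial\, (-I \ln I)}{\partial I} = -1 - \ln I, \\
M(u',v') &= \frac{\partial S}{\partial V(u',v')}
 = \int\!\!\int \big(-1 - \ln I(l,m)\big)\, e^{+2\pi i (lu' + mv')} \, dl\, dm .
\end{align*}
```

Setting M(u',v') = 0 at every unmeasured (u',v') says that the Fourier transform of
ln I (the constant -1 contributes only at zero spacing) vanishes there, which is the
band-limited property used in the interpretation.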

More properties of the MEM solution are given in the references cited earlier. But one
can immediately see that taking the exponential of a function with only a limited range
of spatial frequencies (those present in the dirty beam) is going to generate all spatial
frequencies, i.e., one is extrapolating and interpolating in the u-v plane. It is also clear
that the fitting is a nonlinear operation because of the exponential. Adding two data
sets and obtaining the MEM solution will not give the same answer as finding the MEM
solution for each separately and adding later! A little thought shows that this is equally
true of CLEAN.
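
The generation of new spatial frequencies by the exponential is easy to see in one
dimension. The sketch below uses an arbitrary band-limited function with harmonics 2 and 3
only; the amplitudes and the band edge are illustrative choices.

```python
import numpy as np

n = 64
x = np.arange(n)
k = np.fft.fftfreq(n) * n          # integer spatial frequencies in FFT order

# Band-limited "log brightness": only |k| <= 3 is present.
h = 1.5 * np.cos(2 * np.pi * 2 * x / n) + 1.0 * np.sin(2 * np.pi * 3 * x / n)
brightness = np.exp(h)             # positive everywhere by construction

def power_outside_band(f, k_max=3):
    """Largest Fourier amplitude of f at spatial frequencies above k_max."""
    spectrum = np.abs(np.fft.fft(f))
    return spectrum[np.abs(k) > k_max].max()

outside_h = power_outside_band(h)             # zero up to rounding error
outside_I = power_outside_band(brightness)    # clearly nonzero
```

The function `h` has nothing beyond the band, while `brightness` does: the exponential
has filled in frequencies that were never present in its argument.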

If one has a default image I_d in the definition of the entropy function, then the same
algebra shows that I/I_d is the exponential of a band-limited function. This could be
desirable. For example, while imaging a planet, if the sharp edge is put into I_d, then
the MEM does not have to do so much work in generating new spatial frequencies in the
ratio I/I_d. The spirit is similar to using a window to help CLEAN find sources in the right
place.


12.4.3 Noise and Residuals 

The discussion so far has made no reference to noise in the interferometric
measurements. But this can readily be accommodated in the Bayesian framework. One now treats
the measurements not as constraints but as having a Gaussian distribution around the
“true” value which the real sky would Fourier transform to. Thus the first factor P(B|A)
on the right hand side of Bayes’ theorem would now read

\[ P(B|A) = \prod_{u,v} \exp\left( -\left| \int\!\!\int I(l,m) \exp(-2\pi i(lu + mv)) \, dl \, dm - V_m(u,v) \right|^2 / 2\sigma_{u,v}^2 \right). \]

The product is over measured values of u, v. A nice feature of the gaussian distribution is
that when we take its logarithm, we get the sum of the squares of the residuals between
the model predictions (the integral above) and the measurements V_m(u,v), also known
as “chi-squared” or χ². The logarithm of the prior is of course the entropy factor. So, in
practice, we end up maximising a linear combination of the entropy and χ², the latter
with a negative coefficient. This is exactly what one would have done, using the method
of Lagrange multipliers, if we were maximising entropy subject to the constraint that the
residuals should have the right size, predicted by our knowledge of the noise.
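
The quantity being maximised can be written down directly on a grid. This is a sketch of
the objective only, not of a maximisation algorithm; it assumes a single noise level σ for
all visibilities, a gridded sampling function W in FFT order, and a weight `lam` standing
in for the Lagrange multiplier. All names are hypothetical.

```python
import numpy as np

def entropy(image):
    """-sum I ln I over pixels (image must be strictly positive)."""
    return -np.sum(image * np.log(image))

def chi_squared(image, measured_vis, W, sigma):
    """Sum of squared residuals over the sampled (u, v) cells, in units of sigma^2."""
    model_vis = np.fft.fft2(image)
    sampled = W > 0
    return np.sum(np.abs(model_vis[sampled] - measured_vis[sampled]) ** 2) / sigma**2

def mem_objective(image, measured_vis, W, sigma, lam):
    """Entropy minus lam times chi-squared; lam > 0 plays the role of the multiplier."""
    return entropy(image) - lam * chi_squared(image, measured_vis, W, sigma)

# Toy data: a positive random sky, random coverage, noise-free measurements.
n = 16
rng = np.random.default_rng(3)
sky = rng.random((n, n)) + 0.1
W = (rng.random((n, n)) < 0.5).astype(float)
measured = W * np.fft.fft2(sky)
flat = np.full_like(sky, sky.mean())     # same total flux, spread uniformly

chi2_true = chi_squared(sky, measured, W, sigma=1.0)
chi2_flat = chi_squared(flat, measured, W, sigma=1.0)
```

The flat map has the higher entropy but the worse χ², and the weight `lam` sets the
exchange rate between the two.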

All is not well with this recipe for handling the noise. The discrepancy between the
measured data and the model predictions can be thought of as a residual vector in a
multidimensional data space. We have forced the length to be right, but what about the
direction? True residuals should be random, i.e. the residual vector should be uniformly
distributed on the sphere of constant χ². But since we are maximising entropy on this
sphere, there will be a bias towards that direction which points along the gradient of the
entropy function. This shows in the maps as a systematic deviation tending to lower the
peaks and raise the “baseline”, i.e. the parts of the image near zero I. To lowest order, this
can be rectified by adding back the residual vector found by the algorithm. This does not
take care of the invisible distribution which the MEM has produced from the residuals,
but is the best we can do. Even in the practice of CLEAN, residuals are added back for
similar reasons.

The term “bias” is used by statisticians to describe the following phenomenon. We 
estimate some quantity, and even after taking a large number of trials its average is 
not the noise-free value. The noise has got “rectified” by the non-linear algorithm and 
shows itself as a systematic error. There are suggestions for controlling this bias by 
imposing the right distribution and spatial correlations of residuals. These are likely to 
be algorithmically complex but deserve exploration. They could still leave one with some 
subtle bias since one cannot really solve for noise. But to a follower of Bayes, bias is not
necessarily a bad thing. What is a prior but an expression of prejudice? Perhaps the only
way to avoid bias is to stop with publishing a list of the measured visibility values with 
their errors. Perhaps the only truly open mind is an empty mind! 


12.5 Further Reading 

1. R.A. Perley, F.R. Schwab, & A.H. Bridle, eds., ‘Synthesis Imaging in Radio
Astronomy’, ASP Conf. Series, vol. 6.

2. Thompson, A.R., Moran, J.M. & Swenson, G.W. Jr., ‘Interferometry & Synthesis in
Radio Astronomy’, Wiley Interscience, 1986.

3. Steer, D.G., Dewdney, P.E. & Ito, M.R., ‘Enhancements to the deconvolution
algorithm “CLEAN”’, 1984, A&A, 137, 159.



Chapter 13 


Spectral Line Observations 


K. S. Dwarakanath 

This chapter is intended as an introduction to spectral line observations at radio
wavelengths. While an attempt will be made to put together most of the relevant details, it is
not intended to be an exhaustive guide to spectral line observations but instead focuses
more on the basics of spectral line observations, keeping in mind synthesis arrays like
the Giant Metrewave Radio Telescope (GMRT).


13.1 Spectral Lines 

Spectral lines originate under a variety of circumstances in Astronomy. The most
ubiquitous element in the Universe, the Hydrogen atom, gives rise to the 21-cm line (ν ~
1420.405 MHz) due to a transition between the hyperfine levels of its ground state. If the
Hydrogen atom is ionized, subsequent recombinations of electrons and protons lead to a
series of recombination lines of the Hydrogen atom. It is easy to see that such transitions
between higher Rydberg levels give rise to spectral lines at radio wavelengths. Transitions
around Rydberg levels of 280, for example, give rise to recombination lines at ν ~ 300 MHz.
In cold (kinetic temperature ~ 100 K) and dense (~ 1000 cm^-3) environments Hydrogen
atoms form molecules. The CO molecule which has been used as a tracer of
molecular Hydrogen has a rotational transition at ν ~ 115 GHz. These are a few illustrative
examples.
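
The recombination line frequency quoted above follows from the Rydberg formula. The
sketch below assumes the standard hydrogen Rydberg frequency (R_H c, about 3.288 x 10^15
Hz, including the reduced-mass correction) and considers only transitions from level
n + dn to level n; the function name is an illustrative choice.

```python
RYDBERG_FREQ_H_HZ = 3.28805e15   # R_H * c for hydrogen, approximate value

def recombination_line_freq(n, dn=1):
    """Frequency (Hz) of the hydrogen transition from level n + dn to level n."""
    return RYDBERG_FREQ_H_HZ * (1.0 / n**2 - 1.0 / (n + dn) ** 2)

freq_280_mhz = recombination_line_freq(280) / 1e6   # close to 300 MHz
freq_100_ghz = recombination_line_freq(100) / 1e9   # a lower level, for comparison
```

For large n the frequency falls roughly as 2 R_H c / n^3, which is why the high Rydberg
levels land in the radio band.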

The widths of spectral lines arise due to different mechanisms. One such is the
Doppler effect. The particles in a gas have random motions corresponding to the
kinetic temperature of the gas. The observed frequency of the line is thus different from
the rest frequency emitted by the particles. In a collision-dominated system, the number
density of particles as a function of velocity is expected to be a Maxwellian distribution.
The width of this distribution will result in a corresponding broadening of the observed
spectral line due to the Doppler effect. This width, arising due to the temperature of the gas,
is called thermal broadening. In addition to the thermal motion of the particles, there
can also be turbulent velocities associated with macroscopic gas motions. These motions
are often accounted for by an effective Doppler width, which includes both thermal and
turbulent broadening, assuming a gaussian distribution for the turbulent velocities also.
Another mechanism which can contribute to the line width is pressure broadening. This
arises due to collisions and is particularly relevant in high density environments and/or
for lines arising through transitions between high Rydberg levels. In addition, there is
always a natural width to the spectral line imposed by the uncertainty principle, but it is
almost always overwhelmed by that due to the mechanisms mentioned earlier.

An observed spectral feature can be much wider than that expected on the basis of
the above mentioned mechanisms. This is usually due to systematic motion of the gas
responsible for the spectral feature, for example rotation of a gas cloud, expansion of a
gas cloud, differential rotation of a galaxy, etc.
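
To attach numbers to the thermal and turbulent widths discussed above, the sketch below
computes the full width at half maximum (FWHM) of a gaussian line from the one-dimensional
Maxwellian velocity dispersion, sqrt(kT/m), and adds a gaussian turbulent width in
quadrature. The hydrogen atom mass is approximate and the function names are
illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23     # J/K
M_HYDROGEN = 1.6735e-27        # kg, approximate mass of a hydrogen atom

def thermal_fwhm_kms(temperature_k, mass_kg=M_HYDROGEN):
    """Thermal Doppler FWHM in km/s for particles of the given mass."""
    sigma = math.sqrt(K_BOLTZMANN * temperature_k / mass_kg)   # m/s, line of sight
    return math.sqrt(8.0 * math.log(2.0)) * sigma / 1.0e3

def effective_fwhm_kms(temperature_k, turbulent_fwhm_kms, mass_kg=M_HYDROGEN):
    """Thermal and gaussian turbulent widths combined in quadrature."""
    return math.hypot(thermal_fwhm_kms(temperature_k, mass_kg), turbulent_fwhm_kms)

fwhm_100k = thermal_fwhm_kms(100.0)            # atomic hydrogen at 100 K
fwhm_total = effective_fwhm_kms(100.0, 3.0)    # with 3 km/s of turbulence
```

A feature many times wider than `fwhm_total` would then point to systematic motions of
the kind listed in the preceding paragraph.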


13.2 Rest Frequency and Observing Frequency 


The rest frequency of a spectral line of interest can be calculated if it is not already
tabulated. The apparent frequency (or, the observing frequency), however, needs to be
calculated for each source since it depends on the relative velocity between the source
and the observer. The observed frequency (ν_o) of a given transition is related to the rest
frequency of the line (ν_l) and the radial velocity of the source w.r.t. the observer (v_r) as
(ν_l - ν_o) = ν_o v_r/c, where c is the velocity of light. This relation is valid for v_r << c, and
θ << π/2, where θ is the angle between the velocity vector and the radiation wave vector.
The radial velocity is positive if the motion is away from the observer and the observed
frequency is smaller than the rest frequency of the line. In this situation, the line is
redshifted. If the velocity (v_r) is known, the observing frequency can be calculated. While
dealing with extragalactic systems, one quotes the redshift rather than the radial velocity.
The redshift (z) is related to the rest and observed frequencies as z = (ν_l - ν_o)/ν_o and
approximates to v_r/c for v_r << c.
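
As a worked example of these relations, the sketch below solves the low-velocity formula
for the observed frequency and evaluates the redshift. The radial velocity of 300 km/s is
a hypothetical value, and the function names are illustrative.

```python
C_KMS = 299792.458    # velocity of light in km/s

def observed_frequency(rest_freq, v_r_kms):
    """Solve (nu_l - nu_o) = nu_o * v_r / c for nu_o (valid for v_r << c)."""
    return rest_freq / (1.0 + v_r_kms / C_KMS)

def redshift(rest_freq, obs_freq):
    """z = (nu_l - nu_o) / nu_o."""
    return (rest_freq - obs_freq) / obs_freq

# The 21-cm line from a source receding at a hypothetical 300 km/s.
nu_rest_mhz = 1420.405
nu_obs_mhz = observed_frequency(nu_rest_mhz, 300.0)
z = redshift(nu_rest_mhz, nu_obs_mhz)
```

The line moves down in frequency by about 1.4 MHz, which is large compared with typical
channel widths and must be allowed for when the observing frequency is set.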

It is more useful, and common, to define velocities w.r.t. the 'local standard of rest'
than w.r.t. an arbitrary frame of reference. This transformation takes into account the
radial velocity corrections due to the rotation of the earth about its own axis, the
revolution of the earth around the Sun, and the motion of the Sun w.r.t. the local group of
stars. The magnitudes of these corrections are within ~ 1 km s^-1, 30 km s^-1, and 20
km s^-1 respectively. The actual value of the total correction depends on the equatorial
coordinates of the source, the ecliptic coordinates of the source, the longitude of the Sun,
the hour angle of the source, and the geocentric latitude of the observer.

In principle, the apparent frequency of a spectral line from a source is always changing
due to the change in the radial velocity between the source and the observer. In a given
observing session during a day the source can be observed from rise to set. During
this period the radial component of the velocity between the source and the earth due
to the rotation of the earth can (in an extreme case) change from -0.465 to +0.465 km
s$^{-1}$. Consider observing a narrow spectral line (width ~0.5 km s$^{-1}$) from this source
using a spectral resolution of ~0.1 km s$^{-1}$. If no extra precautions are taken, the peak
of the spectral line will appear to drift slowly across the channels during the course of
the day. This drift, if not accounted for, will decrease the signal-to-noise ratio of the
line and increase its observed width in the time-averaged spectrum. Depending on the
circumstances, this can completely wash out the spectral line. To overcome
this, the continuous change in the apparent frequency has to be corrected for during an
observing session so that the spectral line does not drift across frequency but stays in
the same channels. This process of correction is known as Doppler Tracking. I would
like to emphasize that this is important if one is observing narrow lines with high spectral
resolution and there is a significant change in the sight-line component of the earth's
rotation during the observing session.
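
The size of the drift in the example above can be estimated as follows. This is a rough sketch under stated assumptions: the geometric expression $v = 0.465 \cos\phi \cos\delta \sin H$ km s$^{-1}$ for the rotational component (latitude $\phi$, declination $\delta$, hour angle $H$) is the standard spherical-geometry result and is not taken from the text, and the function names are hypothetical.

```python
import math

V_EQUATOR_KM_S = 0.465  # rotation speed at the earth's equator, as quoted in the text


def diurnal_velocity(hour_angle_deg, dec_deg, lat_deg):
    """Radial velocity (km/s) of the source relative to the observer due to
    the earth's rotation; positive means receding.

    Assumed geometry: v = v_eq * cos(lat) * cos(dec) * sin(HA), so a rising
    source (HA < 0) is approached and a setting source (HA > 0) recedes.
    """
    h, d, p = (math.radians(x) for x in (hour_angle_deg, dec_deg, lat_deg))
    return V_EQUATOR_KM_S * math.cos(p) * math.cos(d) * math.sin(h)


def channel_drift(v_start_km_s, v_end_km_s, resolution_km_s):
    """Number of channels the line moves across if no Doppler tracking is done."""
    return abs(v_end_km_s - v_start_km_s) / resolution_km_s


# Extreme case: equatorial observer, source on the celestial equator,
# followed from rise (HA = -90 deg) to set (HA = +90 deg).
v_rise = diurnal_velocity(-90.0, 0.0, 0.0)
v_set = diurnal_velocity(+90.0, 0.0, 0.0)
drift = channel_drift(v_rise, v_set, 0.1)
print(f"v_rise = {v_rise:+.3f} km/s, v_set = {v_set:+.3f} km/s, drift = {drift:.1f} channels")
```

In this extreme case the line moves by about 9 channels of 0.1 km s$^{-1}$, which is more than the assumed line width of ~5 channels, so the time-averaged line would be badly smeared.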





13.3 Setting the Observing Frequency and the Bandwidth 

Once the apparent frequency $\nu_o$ of the transition of interest is known, the Local Oscillator
(LO) frequencies can be tuned to select this frequency for observations. In general, there
can be more than one LO that needs to be tuned. Consider the situation at the GMRT. The
First LO ($\nu_{ILO}$) can be chosen such that $\nu_{ILO} = \nu_o \pm \nu_{IF}$, where $\nu_{IF}$ is the Intermediate
Frequency (IF). The First LO can be tuned in steps of 5 MHz. The IF is 70 MHz. The
IF bandwidth ($\delta\nu_{IF}$) can be chosen from one of 6, 16, and 32 MHz. Thus, the output of
the first mixer will be over a frequency range of $\nu_{IF} \pm \delta\nu_{IF}/2$. The baseband LO ($\nu_{BBLO}$)
can be tuned in the range of 50 to 90 MHz in steps of 100 Hz to bring the IF down to
the baseband. The bandwidth of the baseband filter ($\delta\nu_{BB}$) can be chosen from 62.5 kHz
to 16 MHz in factors of 2. The bands from $-\delta\nu_{BB}/2$ to 0, and from 0 to $\delta\nu_{BB}/2$, which
are the lower and the upper side bands respectively, will be processed separately. The
FX Correlator at the GMRT will produce 128 spectral channels (0 - 127) covering each
of these bands. The 0th channel corresponds to a frequency of $\nu_o + \nu_{BBLO} - \nu_{IF}$, and
the frequency increases with channel number in the USB spectrum and decreases with
channel number in the LSB spectrum.

While setting the LO frequencies one needs to make sure that (a) the desired LO
frequency is in the allowed range and that the oscillator is 'locked' to a stable reference,
and (b) that the required power output is available from the oscillator. The choice of the
baseband filter bandwidth depends on the velocity resolution and the velocity coverage
required for a given observation. In addition, it is preferable to have as many line-free
channels in the band as there are channels with the line in order to be able to obtain a
good estimate of the observed baseline (or reference spectrum). One would also like to
center the spectral feature within the observed band so that line-free channels on either
side can be used to estimate the baseline. The velocity resolution should be at least a
factor of two better than the full width at half maximum of the narrowest feature one is
expecting to detect.
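
The trade-off between velocity resolution and velocity coverage can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the usual low-velocity conversion $\Delta v = c\,\Delta\nu/\nu$, and the 1 MHz band at the 21-cm line is a hypothetical setup, not a recommended GMRT configuration.

```python
C_KM_S = 299792.458  # velocity of light in km/s


def velocity_setup(bandwidth_hz, n_channels, centre_freq_hz):
    """Return (channel width, total coverage) in km/s for a given band,
    using dv = c * dnu / nu (valid for v << c)."""
    channel_width_hz = bandwidth_hz / n_channels
    dv = C_KM_S * channel_width_hz / centre_freq_hz
    coverage = C_KM_S * bandwidth_hz / centre_freq_hz
    return dv, coverage


def resolution_is_adequate(dv_km_s, narrowest_fwhm_km_s):
    """Rule of thumb from the text: the resolution should be at least a
    factor of two better than the FWHM of the narrowest expected feature."""
    return dv_km_s <= narrowest_fwhm_km_s / 2.0


# Hypothetical setup: 1 MHz band, 128 channels, centred on the 21-cm line.
dv, coverage = velocity_setup(1.0e6, 128, 1420.405752e6)
print(f"resolution = {dv:.2f} km/s, coverage = {coverage:.0f} km/s")
print("adequate for a 5 km/s wide line:", resolution_is_adequate(dv, 5.0))
```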

At present, the FX Correlator at the GMRT produces 128 channels per side band for 
each of the two polarizations. The two polarizations are identified as the 130 MHz and 
the 175 MHz channels. In principle it should be possible to drop one of the polarizations 
to obtain 256 channels for one polarization. This will improve the spectral resolution by 
a factor of 2 keeping the velocity coverage (the bandwidth) the same. This can be very 
useful in observing narrow lines over a wider range of velocities. 


13.4 Calibration 

The observed spectrum has to be corrected for the telescope response as a function of
frequency across the band to obtain an estimate of the true spectrum. The telescope
response is in general complex, with both amplitude and phase variations across the
observing band. This overall response across the band can be split into two components:
(1) an overall gain (amplitude and phase) of the telescope for a reference radio
frequency (RF) within the observing band, and (2) a variation of this gain across channels
(the bandshape). The telescope response is thus a combination of RF gain calibration
and IF bandshape calibration. This way of looking at the telescope calibration is useful
since the requirements for determining these two parts of the telescope response can be
different. For example, the IF bandshape variation is expected to be slower in time than the
RF gain variation and hence needs to be estimated less often. The spectral scale for the IF
bandshape is, however, narrower than that of the RF gain.


13.4.1 Gain Calibration

This is usually achieved by observing a bright, unresolved source which is called a
calibrator. In the case of a synthesis array such as the GMRT, the gain calibration
amounts to estimating the gains of the individual antennas in the array. The gains of any
given pair of antennas are reflected in the visibility (or the cross correlation) of the calibrator
measured by them. In an array with $N$ antennas, there are $N(N-1)/2$ independent
estimates of the calibrator (an unresolved bright source) visibility at any given instant of time.
However, there are only $2N$ unknowns, viz., $N$ amplitudes and $N$ phases of the $N$
antennas. Hence, the measured visibilities can be used in a set of simultaneous equations to
solve for these $2N$ unknowns. In practice, a calibrator close (in direction) to the source
is observed for a suitable length of time using the same setup as that for the spectral
line observations towards the source. A suitable number of spectral channels are
averaged to improve the signal-to-noise ratio on the calibrator, which is then used to estimate
the gains of the antennas. Apart from the instrumental part, the gains also include
atmospheric offsets/contributions. The proximity of the calibrator to the source ensures
that the atmospheric offsets/contributions are similar in both observations and hence
get corrected for through the 'calibration' process.
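
To illustrate how the over-determined set of equations can be solved, here is a minimal sketch. It assumes a unit-flux point-source calibrator at the phase centre, so that the visibility on baseline $(i,j)$ is simply $g_i g_j^*$, and it uses a damped iterative least-squares update. This is one simple choice made for illustration, not necessarily the algorithm used in any particular reduction package, and the function names are hypothetical.

```python
import numpy as np


def solve_antenna_gains(vis, n_iter=200):
    """Estimate complex antenna gains from a matrix of calibrator visibilities.

    vis[i, j] is assumed to equal g_i * conj(g_j) (unit point source);
    the diagonal (autocorrelations) is ignored.  Holding all other gains
    fixed, the least-squares solution for antenna i is
        g_i = sum_j V_ij g_j / sum_j |g_j|^2   (j != i),
    and averaging with the previous estimate damps oscillations.
    """
    n_ant = vis.shape[0]
    off_diag = ~np.eye(n_ant, dtype=bool)
    gains = np.ones(n_ant, dtype=complex)
    for _ in range(n_iter):
        num = np.where(off_diag, vis * gains[np.newaxis, :], 0).sum(axis=1)
        den = np.where(off_diag, np.abs(gains[np.newaxis, :]) ** 2, 0).sum(axis=1)
        gains = 0.5 * (gains + num / den)
    return gains


def reference_to_first(gains):
    """Remove the unconstrained overall phase by referencing to antenna 0."""
    return gains * np.exp(-1j * np.angle(gains[0]))


# Simulated noiseless example with 8 antennas.
rng = np.random.default_rng(0)
n_ant = 8
true_gains = rng.uniform(0.8, 1.2, n_ant) * np.exp(1j * rng.uniform(-0.5, 0.5, n_ant))
vis = np.outer(true_gains, np.conj(true_gains))

estimated = solve_antenna_gains(vis)
max_error = np.max(np.abs(reference_to_first(estimated) - reference_to_first(true_gains)))
print(f"largest gain error after phase referencing: {max_error:.2e}")
```

Note that the visibilities constrain only phase differences between antennas, so one phase (here that of antenna 0) has to be fixed by convention.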

How often one calibrates depends on various factors, such as the
observing frequency, the length of the baseline involved, the telescope characteristics, and the
time scale for variations in the atmospheric offsets/contributions. The frequency of
calibration can vary from once in ~10 minutes to once in an hour depending on these
factors.


13.4.2 Bandshape Calibration 

In this case too, a bright, unresolved source is used as a calibrator, but the nearness
requirement (as in the gain calibration) is not essential. On the other hand, the calibrator
should not have any spectral features in the band of interest. The measured visibilities
from the calibrator across the band of interest can once again (as in the earlier gain
calibration) be used to estimate the antenna bandshapes. The observed spectrum from
the source is divided by the bandshapes to obtain the true spectrum. The bandshape
should have a signal-to-noise ratio (snr) significantly greater than that of the observed
spectrum so that the snr in the corrected spectrum is not degraded. For example, if the
bandshape and the observed spectrum have equal snr, then the corrected spectrum will
have an snr which is a factor of $\sqrt{2}$ worse (assuming gaussian statistics of noise).
Ideally, one would not want the corrected spectrum to degrade in its snr by more than
~10%. This can be used as a criterion to judge if a given calibrator is bright enough and to
decide the amount of integration time required for the source and for the calibrator.
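
The criterion above can be made quantitative. The sketch below assumes independent gaussian noise in the source spectrum and the bandshape and first-order error propagation for their ratio, so that the fractional errors add in quadrature; the function names are hypothetical.

```python
import math


def corrected_snr(snr_source, snr_bandshape):
    """snr of (source spectrum / bandshape) when the two fractional
    errors are independent and add in quadrature."""
    return 1.0 / math.sqrt(1.0 / snr_source**2 + 1.0 / snr_bandshape**2)


def bandshape_snr_needed(snr_source, max_loss=0.10):
    """Bandshape snr required so that the corrected spectrum loses no
    more than max_loss (fractional) of the source snr."""
    keep = 1.0 - max_loss
    return snr_source / math.sqrt(1.0 / keep**2 - 1.0)


equal_case = corrected_snr(10.0, 10.0)          # sqrt(2) worse than 10
needed = bandshape_snr_needed(10.0, 0.10)       # bandshape snr for a 10% loss
print(f"equal snr case: {equal_case:.2f}")
print(f"bandshape snr needed for <= 10% loss: {needed:.1f}")
```

Under these assumptions the bandshape needs roughly twice the snr of the source spectrum to keep the degradation within 10%.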

There are two methods of bandshape calibration. 

(1) Position Switching : In this method, the telescope cycles through the source and a
bandshape calibrator, observing both at the same frequency and bandwidth.
Depending on the accuracy to which the corrected bandshape is required, and the stability of the
receiver, the frequency of bandshape calibration can vary from once in ~20 minutes to
once in a few hours.

(2) Frequency Switching : There are situations when position switching is not a
suitable scheme to do the bandshape calibration. This can happen due to (at least) two
reasons : (a) The band of interest covers the Galactic HI. In this situation, all calibrators
will also have some spectral feature within this band due to the ubiquitous presence of
Galactic HI. No calibrator is suitable for bandshape calibration. (b) The band is outside
the Galactic HI but the source of interest is a bright unresolved source. In this case one
might end up observing any other calibrator much longer (~10 times) than the source
in order to achieve the desired signal-to-noise ratio on the bandshape. In either of these
situations position switching is not desirable. An alternative scheme is employed.

If a spectral feature covers a bandwidth of $\delta\nu$ centered at $\nu$, quite often it is possible to
find line-free regions in the bands centered at $\nu \pm \delta\nu$. The bandshapes at these adjacent
frequencies can be used to calibrate the observed spectrum. This works well because
the bandshape is largely decided by the narrowest band in the signal path through the
telescope. This is usually decided by the baseband filter. The bandwidth of this filter is
selected to be the same while observing at frequencies $\nu - \delta\nu$, $\nu$, and $\nu + \delta\nu$. It is important
to keep in mind that frequency switching works as long as $\delta\nu$ is small compared with the
bandwidth of the front-end devices and feeds. This is usually the case. For example, at the
GMRT, the 21-cm feeds have a wide-band response, over 500 MHz. This is divided into
4 sub-bands each of 120 MHz width. If the amount by which the frequency is switched
is small compared to 120 MHz this technique should work quite satisfactorily. A typical
frequency switching observation would thus have an "off1", an "on", and an "off2" setting.
The "on" setting centers the band at the spectral feature of interest (at $\nu$) with a bandwidth
of $\delta\nu$, while the "off1" and "off2" settings will be centered at $\nu - \delta\nu$ and $\nu + \delta\nu$ respectively.
The three settings will be cycled through with appropriate integration times. The average
of the "off1" and "off2" bandshapes can be the effective bandshape to calibrate the "on"
spectrum. In this situation, equal amounts of time are spent "off" the line and "on" the
line to achieve the optimum signal-to-noise ratio in the final spectrum. However, the
switching frequency itself will depend on the receiver stability and the flatness of the
corrected bandshape required. This could vary from once in ~20 minutes to once in a
few hours.
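
The "off1"/"on"/"off2" scheme described above can be sketched numerically as follows. This is a toy example with a made-up, noiseless bandshape and line, intended only to show the arithmetic of the correction; the function name and the synthetic shapes are assumptions.

```python
import numpy as np


def frequency_switch_calibrate(on, off1, off2):
    """Divide the 'on' spectrum by the mean of the two 'off' bandshapes."""
    bandshape = 0.5 * (off1 + off2)
    return on / bandshape


chan = np.arange(128)

# Synthetic bandshape set by the baseband filter (same for all three settings).
bandshape = 1.0 + 0.2 * np.cos(2.0 * np.pi * chan / 128.0)

# Synthetic line: 5% of the continuum, centred in the band.
line = 0.05 * np.exp(-0.5 * ((chan - 64) / 4.0) ** 2)

on = bandshape * (1.0 + line)   # band centred on the line
off1 = bandshape.copy()         # band centred at nu - delta_nu (line free)
off2 = bandshape.copy()         # band centred at nu + delta_nu (line free)

corrected = frequency_switch_calibrate(on, off1, off2)
print(f"peak line-to-continuum ratio recovered: {corrected.max() - 1.0:.3f}")
```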

There are situations when one might want to do both frequency and position switching. If
one is observing Galactic HI absorption towards a weak continuum source, it is
advantageous to obtain bandshape calibration by observing a brighter continuum source with
frequency switching.


13.5 Smoothing 

The cross power spectrum is obtained by measuring the correlation of signals from
different antennas as a function of time offset between them. A spectrum with a bandwidth
$\delta\nu$ and $N$ channels is produced by cross correlating signals sampled at an interval of $\tau$ with
relative time offsets in the range $-N\tau$ to $(N-1)\tau$, where $\tau = 1/(2\delta\nu)$. Because of this
truncation in the offset time range, which amounts to a rectangular window, the resulting spectrum
is equivalent to the true spectrum convolved with a sinc function. Thus, a delta
function in frequency (a narrow spectral line, for example) will result in an appropriately shifted
$\sin(N\pi\nu/\delta\nu)/(N\pi\nu/\delta\nu)$ pattern, where $\delta\nu/N$ is the channel separation. The full width at
half maximum of the sinc function is $1.2\,\delta\nu/N$. This is the effective resolution. Any sharp
edge in the spectrum will result in an oscillating function of this form. This is called
the Gibbs' phenomenon. There are different smoothing functions that bring down this
unwanted ringing, but at the cost of spectral resolution. One of the commonly used
smoothing functions in radio astronomy is that due to Hanning weighting of the
correlation function. This smoothing reduces the first sidelobe from 22% (for the sinc function)
to 2.7%. The effective resolution will be $2\,\delta\nu/N$. After such a smoothing, one retains only
the alternate channels. For Nyquist sampled data, the Hanning smoothing is achieved by
replacing every sample by the sum of one half of its original value and one quarter of the
original values at the two adjacent positions.
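
The three-point recipe in the last sentence is easy to write down explicitly. In the sketch below the handling of the two edge channels (treating the missing neighbour as zero) is an assumption made for simplicity, and the function name is hypothetical.

```python
import numpy as np


def hanning_smooth(spectrum, decimate=True):
    """Hanning-smooth a spectrum with the (1/4, 1/2, 1/4) kernel.

    Edge channels are smoothed as if the missing neighbour were zero
    (an assumption; edge channels are often discarded in practice).
    If decimate is True, only alternate channels are retained.
    """
    kernel = np.array([0.25, 0.5, 0.25])
    smoothed = np.convolve(spectrum, kernel, mode="same")
    return smoothed[::2] if decimate else smoothed


# A single-channel spike spreads into (1/4, 1/2, 1/4).
spike = np.zeros(9)
spike[4] = 1.0
full = hanning_smooth(spike, decimate=False)
half = hanning_smooth(spike, decimate=True)
print(full)
print(half)
```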

Apart from Hanning smoothing, which is required to reduce the ringing, additional
smoothing of the spectra might be desirable. The basic point is that a spectral line of
given width will have the best signal-to-noise ratio when observed with a spectral
resolution that matches its width. This is the concept of 'matched-filtering' and is particularly
important in detection experiments.


13.6 Continuum Subtraction 

Quite often spectral line observations include continuum flux density present in the band.
The continuum in the band can arise due to a variety of reasons. Ionized Hydrogen
regions, for example, give rise to the radio recombination lines of Hydrogen due to bound-bound
transitions and the radio continuum due to thermal bremsstrahlung. Galaxies can have
strong non-thermal radio continuum as well as 21-cm-line emission and/or absorption.
In addition, any absorption spectral line experiment involves a bright continuum
background source. In these and similar situations, detecting a weak spectral line in the
presence of strong continuum contribution can be very difficult. Depending on the
complexity of the angular distribution of the continuum flux density and that of the spectral
feature, this task might become almost impossible.

The basic problem here is one of spectral dynamic range (SDR). The spectral dynamic
range is the ratio of the weakest spectral feature that can be detected to the continuum
flux density in the band. This is limited by the residual errors which arise due to a variety
of reasons such as instrumental variations, atmospheric gain changes, and
deconvolution errors. Of these, the multiplicative errors limit the SDR depending
on the continuum flux density in the band. Thus, if the multiplicative errors are at the 1%
level and the continuum flux density in the band is 10 Jy, no spectral line detection
is possible below 100 mJy. On the other hand, a continuum subtraction (if successful)
will lead to a situation where the SDR is decided by the peak spectral line flux density
rather than the continuum flux density. Apart from the continuum flux density, any other
systematics which have a constant value or a linear variation across frequency will be
subtracted out in the continuum subtraction procedure. This can lead to improvements
in the SDR by several orders of magnitude.

There are several methods for subtracting the continuum flux density from spectral
line data. It is beyond the scope of this lecture to discuss all of these. A brief mention
will be made of one of the simpler methods to illustrate some of the principles involved.
In this method, which has been called 'visibility-based subtraction', a linear fit to the
visibilities as a function of frequency is performed for every sample in time. This best-fit
continuum can then be subtracted from the original visibilities. The resulting data can
be Fourier transformed to produce continuum-free images. This method works quite well
if the continuum emission is spread over a sufficiently small field of view. This limitation
can be understood in the following way. Consider a two-element interferometer with
separation $d$. Let each of the elements of the interferometer be pointing towards $\theta_0$, which is also
the fringe tracking (phase tracking) center. The phase difference between $\theta_0$ and an angle
$\theta$ close to this is $\phi = 2\pi\nu d(\sin\theta - \sin\theta_0)/c$, where $\nu$ is the observing frequency and $c$ is
the velocity of light. For the present purpose of illustration, assume that $\theta$ is in the plane
containing the pointing direction ($\theta_0$) and $d$. The visibilities from a source at $\theta$ will have
the form $A_\nu\cos(\phi)$ and $A_\nu\sin(\phi)$, where $A_\nu$ is the amplitude of the source at $\nu$. Writing
$\nu = \nu_0 + \delta\nu$ and $\theta = \theta_0 + \delta\theta$, where $\nu_0$ is the frequency of the center of the band, it can be
shown that the frequency-dependent part of the phase is $\phi_\nu = 2\pi\,\delta\nu\, d\cos(\theta_0)\,\delta\theta/(\nu_0\lambda_0)$, where
$c = \nu_0\lambda_0$. It is easy to see that the variation of visibilities as a function of frequency is linear
if $\phi_\nu \ll 2\pi$. This implies that $\delta\nu\,\delta\theta/(\nu_0\theta_{syn}) \ll 1$, where $\theta_{syn} = \lambda_0/d$. Thus, this method of
continuum subtraction works if most of the continuum is within $\nu_0/\delta\nu$ synthesized beams
from the phase tracking center.
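
A minimal sketch of the visibility-based subtraction for a single baseline and a single time sample is given below. The choice of line-free channels, the synthetic data, and the function name are assumptions made for illustration; the real and imaginary parts are fitted separately with a first-order polynomial.

```python
import numpy as np


def subtract_continuum(vis, line_free):
    """Fit a straight line in frequency (channel number) to the line-free
    channels of one complex visibility spectrum and subtract it from all
    channels."""
    chan = np.arange(vis.size)
    fit_re = np.polyfit(chan[line_free], vis.real[line_free], 1)
    fit_im = np.polyfit(chan[line_free], vis.imag[line_free], 1)
    continuum = np.polyval(fit_re, chan) + 1j * np.polyval(fit_im, chan)
    return vis - continuum


# Synthetic spectrum: a continuum varying linearly across the band plus a
# weak line in channels 60 to 67.
chan = np.arange(128)
continuum = (2.0 + 1.0j) + (0.01 - 0.005j) * chan
line = np.zeros(128, dtype=complex)
line[60:68] = 0.1
vis = continuum + line

# Channels assumed to be free of line emission (a guard band is left
# around the line).
line_free = np.ones(128, dtype=bool)
line_free[55:73] = False

residual = subtract_continuum(vis, line_free)
print(f"largest residual outside the line: {np.abs(residual[line_free]).max():.2e}")
```

If the continuum lies far from the phase center, its visibility is no longer linear in frequency across the band, and a fit of this kind leaves residuals, which is the limitation derived above.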


13.7 Line Profiles 

If the line width is greater than the spectral resolution one can discuss the variation of the
intensity of the line as a function of frequency. This description, called the line profile, can
be denoted by $\phi(\nu)$. If the reason for the line width is thermal broadening or turbulent
broadening, the line profile will be gaussian, such that $\phi(\nu) \propto e^{-(\nu-\nu_l)^2/2(\delta\nu)^2}$,
where $\nu_l$ is the frequency at the line center and $\delta\nu$ is the rms value of the gaussian.
The width of the line refers to the full width at half maximum and is equal to ~2.35
$\delta\nu$. The observed width of the line ($\delta\nu_o$) and the true width of the line ($\delta\nu_l$) are related by
$\delta\nu_o^2 = \delta\nu_l^2 + \delta\nu_r^2$, where $\delta\nu_r$ is the width of each channel (spectral resolution). This simple
relation is strictly true only when the spectral channels have a gaussian response. In
addition, this is relevant if the widths of the spectral line and the spectral channel are
comparable.
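
The two relations above translate directly into a small calculation. The sketch below assumes gaussian line and channel responses, as the text requires; the function name is hypothetical.

```python
import math

# Full width at half maximum of a gaussian in units of its rms width.
FWHM_PER_RMS = 2.0 * math.sqrt(2.0 * math.log(2.0))


def true_width(observed_width, channel_width):
    """Invert observed^2 = true^2 + channel^2 (all widths in the same
    units, e.g. km/s)."""
    if observed_width < channel_width:
        raise ValueError("observed width cannot be smaller than the channel width")
    return math.sqrt(observed_width**2 - channel_width**2)


print(f"FWHM / rms = {FWHM_PER_RMS:.3f}")
print(f"true width for observed 5.0 and channel 3.0: {true_width(5.0, 3.0):.1f}")
```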

Pressure broadened lines show Voigt profiles. These have a Doppler (gaussian)
profile in the center of the line, whereas the wings are dominated by the Lorentz profile.
Obviously an analysis of the line profile is crucial in understanding the physical
conditions of the system producing the spectral line.

Acknowledgments: I would like to thank A.A. Deshpande for a critical reading of the
manuscript and for useful comments to improve its clarity.


Chapter 14

Wide Field Imaging 


Sanjay Bhatnagar 


14.1 Introduction 

It has been shown in Chapter 2 that the visibility measured by the interferometer, ignoring
the phase rotation, is given by

$$V(u,v,w) = \int\!\!\int I(l,m)\,B(l,m)\,e^{-2\pi\iota(ul+vm+w\sqrt{1-l^2-m^2})}\,\frac{dl\,dm}{\sqrt{1-l^2-m^2}}, \qquad (14.1.1)$$

where $(u,v,w)$ defines the co-ordinate system of antenna spacings, $(l,m,n)$ defines the
direction cosines in the $(u,v,w)$ co-ordinate system, $I$ is the source brightness distribution
(the image) and $B$ is the far field antenna reception pattern. For further analysis we will
assume $B = 1$, and drop it from all equations (for typing convenience [1]!).

Eq. 14.1.1 is not a Fourier transform relation. For a small field of view ($l^2 + m^2 \ll 1$)
the above equation can, however, be approximated well by a 2D Fourier transform relation.
The other case in which this is an exact 2D relation is when the antennas are arranged
in a perfect East-West line. However, array configurations are often designed to maximize
the uv-coverage, and the antennas are arranged in a 'Y' shaped configuration. Hence, Eq.
14.1.1 needs to be used to map the full primary beam of the antennas, particularly at low
frequencies. Eq. 14.1.1 also reduces to a 2D relation for non-EW arrays if the time of
observations is sufficiently small (snapshot observations).

In the first part of this chapter we will discuss the implications of approximating Eq.
14.1.1 by a 2D Fourier transform relation and techniques to recover the 2D sky
brightness distribution.

The field of view of a telescope is limited by the primary beams of the antennas. To 
map a region of sky where the emission is at a scale larger than the angular width of the 
primary beams, mosaicing needs to be done. This is discussed in the second part of this 
lecture. 


[1] The same assumption has been made in Chapter 2.


14.2 Mapping with Non Co-planar Arrays


14.2.1 Image Volume 

Let $n = \sqrt{1-l^2-m^2}$ be treated as an independent variable. Then one can write a 3D
Fourier transform of $V(u,v,w)$ with the conjugate variable for $(u,v,w)$ being $(l,m,n)$, as

$$F(l,m,n) = \int\!\!\int\!\!\int V(u,v,w)\,e^{2\pi\iota(ul+vm+wn)}\,du\,dv\,dw. \qquad (14.2.2)$$

Substituting for $V(u,v,w)$ from Eq. 14.1.1 we get

$$F(l,m,n) = \int\!\!\int \frac{I(l',m')}{\sqrt{1-l'^2-m'^2}} \left\{ \int\!\!\int\!\!\int e^{-2\pi\iota\left(u(l'-l)+v(m'-m)+w(\sqrt{1-l'^2-m'^2}-n)\right)}\,du\,dv\,dw \right\} dl'\,dm'. \qquad (14.2.3)$$

Using the general result

$$\delta(l'-l) = \int e^{-2\pi\iota u(l'-l)}\,du, \qquad (14.2.4)$$

we get

$$F(l,m,n) = \int\!\!\int \frac{I(l',m')}{\sqrt{1-l'^2-m'^2}}\,\delta(l'-l)\,\delta(m'-m)\,\delta\!\left(\sqrt{1-l'^2-m'^2}-n\right) dl'\,dm'. \qquad (14.2.5)$$

This equation then provides the connection between the 2D sky brightness distribution
given by $I(l,m)$ and the result of 3D Fourier inversion of $V(u,v,w)$ given by $F(l,m,n)$,
referred to as the Image volume:

$$F(l,m,n) = \frac{I(l,m)\,\delta\!\left(\sqrt{1-l^2-m^2}-n\right)}{\sqrt{1-l^2-m^2}}. \qquad (14.2.6)$$

Hereafter, I will use $I(l,m,n)$ to refer to this Image volume.

In Eq. 14.1.1, we have ignored the fringe rotation term $2\pi\iota w$ in the exponent. This is
done here only for mathematical (and typing!) convenience. The effect of including this
term would be a shift of the Image volume by one unit along the conjugate axis, namely $n$.
Hence, the effect of fringe stopping is to make the topmost plane of $I(l,m,n)$ tangent to
the phase center position on the celestial sphere, with the rest of the sphere completely
contained inside the Image volume as shown in Fig. 14.1.

Remember that the third variable $n$ of the Image volume is not an independent variable
and is constrained to be $n = \sqrt{1-l^2-m^2}$. Eq. 14.2.6 then gives the physical interpretation
of $I(l,m,n)$. Imagine the celestial sphere defined by $(l,m,n)$ enclosed by the Image volume
$I(l,m,n)$, with the topmost plane being tangent to the celestial sphere as shown in Fig.
14.1. Eq. 14.2.6 then says that only those parts of the Image volume which lie on the
surface of the celestial sphere correspond to physical emission. Note that since the
visibility is written as a function of all three variables $(u,v,w)$, the transfer function will
also be a volume. A little thought will then reveal that $I(l,m,n)$ will also be finite away from the
surface of the celestial sphere, but that would correspond to non-physical emission
in the Image volume due to the side lobes of the telescope transfer function (referred to as the
Point spread function (PSF) or Dirty beam in the literature). A 3D deconvolution using the
Dirty image- and the Dirty beam-volumes will produce a Clean image-volume. Therefore,
after deconvolution, one must perform an extra operation of projecting all points in the
image volume along the celestial sphere onto the 2D tangent plane to recover the 2D sky
brightness distribution. Fig. 14.2 is the graphical equivalent of the statements in this
paragraph.






Figure 14.1: Graphical representation of the geometry of the Image volume and the
celestial sphere. The point at which the celestial sphere touches the first plane of the Image
volume is the point around which the 2D image inversion approximation is valid. For
wider fields, emission at points along the intersection of celestial sphere and the various
planes (labeled here as the celestial sphere) needs to be projected to the tangent plane
to recover the undistorted 2D image. This is shown for 3 points on the celestial sphere,
projected on the tangent plane, along the radial directions.

Figure 14.2: Graphical illustration to compute the distance between the tangent plane
and a point in the sky at an angle of $\theta$.


14.2.2 Interpretation of the w-term 


The term $w\sqrt{1-l^2-m^2}$ is often referred to as the w-term in the literature. The origin of
this term is purely geometrical and arises due to the fact that fringe rotation effectively
phases the array for a point in the sky referred to as the phase center direction. A wave
front originating from this direction will then be received by all antennas and the signals
will be multiplied in-phase at the correlator (effectively phasing the array). The locus of
all points in 3D space for which the array will remain phased is a sphere, referred to as
the celestial sphere. A wave front from a point away from the phase tracking center but
on the surface of such a sphere will carry an extra phase, not due to the geometry of the
array but because of its separation from the phase center. In that sense, the phase of the
wavefront measured by a properly phased array in fact carries the information about the
source structure, and the w-term is the extra phase due to the spherical geometry of the
problem. The sky can be approximated by a 2D plane close to the phase tracking center
and the w-term can be ignored, which is another way of saying that a 2D approximation
can be made for a small field of view. However, sufficiently far away from the phase
center, the phase due to the curvature of the celestial sphere, the w-term, must be taken
into account, and to continue to approximate the sky as a 2D plane, we will have to
rotate the visibility by the w-term. This will be equivalent to shifting the phase centre
and corresponds to a shift of the equivalent point in the image plane. Since the w-term
is a function of the image co-ordinates, this shift is different for different parts of the
image. Shifting the phase centre to any one of the points in the sky will allow a 2D
approximation only around that direction and not for the entire image. Hence the errors
arising due to ignoring the w-term cannot be removed by a constant phase rotation of all
the visibilities. This is another way of understanding that, in the strict sense, the sky
brightness is not a Fourier transform of the visibilities.






14.2.3 Inversion Of Visibilities 

3D Imaging 

The most straightforward method suggested by Eq. 14.2.5 for recovering the sky
brightness distribution is to perform a 3D Fourier transform of $V(u,v,w)$. This requires that
the $w$ axis also be sampled at least at the Nyquist rate. For most observations this
is rarely satisfied, and doing an FFT on the third axis would result in severe
aliasing. Therefore in practice, the transform on the third axis is usually done using the
direct Fourier transform (DFT), on the un-gridded data.

For performing the 3D FT (FFT on the $u$ and $v$ axes and DFT on the $w$ axis) one would
still need to know the number of planes needed along the $n$ axis. This can be found using
the geometry shown in Fig. 14.2. The size of the synthesized beam in the $n$ direction
is comparable to that in the other two directions and is given by $\approx \lambda/B_{max}$, where $B_{max}$ is
the longest projected baseline length. Therefore the separation between the planes along
$n$ should be $\le \lambda/2B_{max}$. The distance between the tangent plane and points separated by
$\theta$ from the phase center is given by $1 - \cos(\theta) \approx \theta^2/2$. For critical sampling, the number of
planes required would then be

$$N_n = B_{max}\theta^2/\lambda. \qquad (14.2.7)$$

At 327 MHz for GMRT, $B_{max} \approx 25$ km. Therefore, for mapping a 1° field of view without
distortions, one would require 8 planes along the $n$ axis. With the central square alone,
however, one plane should be sufficient. At these frequencies it becomes important to
map most of the primary beam since the number and the intensity of the background
sources increase and the side lobes of these background sources limit the dynamic range
in the maps. Hence, even if the source of interest is small, to get the achievable dynamic
range (or close to it!), one will need to do a 3D inversion (and deconvolution).
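
The numbers quoted above follow directly from Eq. 14.2.7. The sketch below reproduces them; the value of ~1 km used for the longest baseline within the central square is an assumption made here for illustration, and the function name is hypothetical.

```python
import math

C_M_S = 299792458.0  # velocity of light in m/s


def planes_for_3d_imaging(b_max_m, field_deg, freq_hz):
    """Number of planes along the n axis from Eq. 14.2.7,
    N_n = B_max * theta^2 / lambda, with theta in radians."""
    wavelength_m = C_M_S / freq_hz
    theta = math.radians(field_deg)
    return b_max_m * theta**2 / wavelength_m


planes_full_array = planes_for_3d_imaging(25.0e3, 1.0, 327.0e6)
planes_central_square = planes_for_3d_imaging(1.0e3, 1.0, 327.0e6)  # ~1 km assumed
print(f"full array:     N_n = {planes_full_array:.1f}")
print(f"central square: N_n = {planes_central_square:.2f}")
```

For the full array this gives a little over 8 planes, and for the central square a value below 1, so that a single plane suffices.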

Another reason why more than one plane would be required for very high dynamic
range imaging is as follows. Strictly speaking, the only point which lies completely in the
tangent plane is the point at which the tangent plane touches the celestial sphere. All
other points in the image, even close to the phase center, lie slightly below the tangent
plane. Deconvolution of the tangent plane then results in distortions for the same
reason as the distortions arising from the deconvolution of a point source which lies
between two pixels in the 2D case. As in the 2D case, this problem can be minimized by
over sampling the image and that, in this case, implies having at least 2 planes along the $n$
axis, even if Eq. 14.2.7 indicates that 1 plane is sufficient.


Polyhedron Imaging 

As mentioned above, emission from the phase center and from points close to it lies
approximately in the tangent plane. Polyhedron imaging relies on exploiting this fact by
approximating the celestial sphere by a number of tangent planes as shown in Fig. 14.3.
The visibility data is phase rotated to shift the phase center to the tangent points of the
various planes and a small region around the tangent point is then mapped using the 2D
approximation. In this case, however, one needs to perform a joint deconvolution
involving all tangent planes since the side lobes of a source in one plane would leak into other
planes as well.

Figure 14.3: Approximation of the celestial sphere by multiple tangent planes (polyhedron
imaging).

The number of planes required to map an object of size $\theta$ can be found simply by
requiring that the maximum separation between the tangent plane and the region around
each tangent point be less than $\lambda/B_{max}$, the size of the synthesized beam. As shown
earlier, the separation of a point at an angle $\theta$ away from the tangent point is $\approx \theta^2$. Hence for
critical sampling, the number of planes required is equal to the solid angle subtended by
the sky being mapped ($\theta_f^2$) divided by the solid angle of the synthesized beam ($\theta^2$):

$$N_{poly} = 2\theta_f^2 B_{max}/\lambda = 2B_{max}\lambda/D^2 \quad (\mbox{for } \theta_f = \mbox{full primary beam}). \qquad (14.2.8)$$

Notice that the number of planes required is twice as many as the number of planes 
required for 3D inversion. However since a small portion around the tangent point of 
each plane is used, the size of each of these planes can be small, offsetting the increase 
in computations due to the increase in the number of planes required. Another approach 
which is often taken for very high dynamic range imaging is to do a full 3D imaging on 
each of the planes. This would effectively increase the size of the field that can be imaged 
on each tangent plane, thereby reducing the number of planes required. 
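
As a rough numerical illustration of Eq. 14.2.8, the sketch below evaluates the number of tangent planes for the full primary beam, taking the primary beam width to be $\theta_f = \lambda/D$. The 45 m dish diameter and the 25 km baseline are GMRT values used as illustrative inputs, and the function name is hypothetical.

```python
def planes_for_polyhedron_imaging(b_max_m, wavelength_m, dish_diameter_m):
    """Number of tangent planes from Eq. 14.2.8 for the full primary beam,
    N_poly = 2 * theta_f^2 * B_max / lambda with theta_f = lambda / D."""
    theta_f = wavelength_m / dish_diameter_m
    return 2.0 * theta_f**2 * b_max_m / wavelength_m


# Illustrative inputs: 327 MHz (lambda ~ 0.9168 m), B_max = 25 km, D = 45 m.
n_poly = planes_for_polyhedron_imaging(25.0e3, 0.9168, 45.0)
print(f"N_poly = {n_poly:.1f}")
```

With these inputs one gets a little over 22 tangent planes, each of which needs to cover only a small region around its tangent point.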

The polyhedron imaging scheme is available in the current version of the AIPS data
reduction package, and the 3D inversion (and deconvolution) is implemented in the (no
longer supported) SDE package written by Tim Cornwell et al. Both these schemes, in their
full glory, will be available in the (recently released) AIPS++ package.


14.3 Mosaicing 

The problems due to non co-planarity discussed above arise when mapping the sky within the
primary beam of the antennas (which are assumed to be identical). In this section we
discuss the techniques used to handle the problem of mapping fields of interest which
are larger than the primary beam of the antennas. The approach used is similar to that
used for mapping with a single dish, namely to scan the source to be mapped. The fact
that we are using an interferometer to synthesize the “lens” (or a “single dish”) adds
some more complications.

These techniques are useful for mapping with interferometers operating in the millimeter
range, where the size of the primary beams is less than an arcmin, and at meter
wavelengths, where the primary beams are larger but so is the extent of emission. For
example, the primary beam of GMRT antennas at 327 MHz is ≈ 1.3° and there are mapping
projects which would benefit from mapping regions of the sky larger than this (for
example, in the Galactic plane).


14.3.1 Scanning Interferometer 


The co-planar approximation of Eq. 14.1.1 for a pointing direction given by $(l_0, m_0)$ can
be written as

$$V(u,v,l_0,m_0) = \int\!\!\int I(l,m)\,B(l-l_0,\,m-m_0)\,e^{2\pi\iota(ul+vm)}\,dl\,dm. \qquad (14.3.9)$$


Here we also assume that $B$ is independent of the pointing direction, and we label $V$
with not just the $(u,v)$ co-ordinates but also the pointing direction, since visibilities for
different directions will be used in the analysis that follows. The advantage of writing
the visibility as in Eq. 14.3.9 is that the pointing center (given by $(l_0,m_0)$) and the phase
center (given by $(l,m) = (0,0)$) are separated.

$V(0,0,l_0,m_0)$ represents the single dish observation in the direction $(l_0,m_0)$ and is just
the convolution of the primary beam with the source brightness distribution, exactly as
expected intuitively. Extending the intuition further, as is done in mapping with a single
dish, we need to scan the source around $(l_0,m_0)$ with the interferometer, which is
equivalent to scanning with a single dish with a primary beam of the size of the synthesized
beam of the interferometer. Then, Fourier transforming $V(u,v,l_0,m_0)$ with respect to
$(l_0,m_0)$ and assuming that $B$ is symmetric, one gets from Eq. 14.3.9

$$\int\!\!\int V(u,v,l_0,m_0)\,e^{2\pi\iota(u_0 l_0+v_0 m_0)}\,dl_0\,dm_0 = b(u_0,v_0)\,i(u+u_0,\,v+v_0), \qquad (14.3.10)$$


where $(u_0,v_0)$ corresponds to the direction $(l_0,m_0)$, and $b \rightleftharpoons B$ and $i \rightleftharpoons I$ are Fourier
transform pairs. This equation essentially tells us the following: the Fourier transform of
the visibility with respect to the pointing directions, from a scanning interferometer, is
equal to the visibility of the entire source modulated by the Fourier transform of the
primary beams for each pointing direction. For a given direction $(l_0,m_0)$ we can recover
spatial frequency information spread around a nominal point $(u,v)$ by an amount $D/\lambda$,
where $D$ is the size of the dish. In terms of information, this is exactly the same as
recovering spatial information smaller than the size of the resolution of a single dish by
scanning the source with a single dish. As in the case of a single dish, continuous scanning
is not necessary, and sampling at points separated by half the primary beam is sufficient.
In principle then, by scanning the interferometer, one can improve the short spacings
measurements of $V$, which is crucial for mapping large fields of view.
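
For readers who want the intermediate step, the following is a short sketch of how Eq. 14.3.10 follows from Eq. 14.3.9. It assumes the same sign convention for the Fourier kernel as in Eq. 14.3.9, writes $\iota$ for the imaginary unit, and defines $b$ and $i$ as the transforms of $B$ and $I$ with that kernel; these conventions are our reading of the text rather than something stated explicitly.

```latex
% Sketch: Eq. 14.3.10 from Eq. 14.3.9, with B assumed symmetric.
% Substitute Eq. 14.3.9 and exchange the order of integration:
\begin{align*}
\int\!\!\int V(u,v,l_0,m_0)\, e^{2\pi\iota(u_0 l_0 + v_0 m_0)}\, dl_0\, dm_0
 &= \int\!\!\int I(l,m)\, e^{2\pi\iota(ul+vm)}
    \left[ \int\!\!\int B(l-l_0,\, m-m_0)\,
           e^{2\pi\iota(u_0 l_0 + v_0 m_0)}\, dl_0\, dm_0 \right] dl\, dm .
\end{align*}
% In the bracket put l' = l - l_0, m' = m - m_0:
\begin{align*}
\left[\;\cdot\;\right]
 &= e^{2\pi\iota(u_0 l + v_0 m)}
    \int\!\!\int B(l',m')\, e^{-2\pi\iota(u_0 l' + v_0 m')}\, dl'\, dm'
  = e^{2\pi\iota(u_0 l + v_0 m)}\, b(u_0,v_0),
\end{align*}
% where the last step uses B(l',m') = B(-l',-m'), so b(-u_0,-v_0) = b(u_0,v_0).
% Hence
\begin{align*}
\int\!\!\int V(u,v,l_0,m_0)\, e^{2\pi\iota(u_0 l_0 + v_0 m_0)}\, dl_0\, dm_0
 &= b(u_0,v_0) \int\!\!\int I(l,m)\,
    e^{2\pi\iota[(u+u_0)l + (v+v_0)m]}\, dl\, dm
  = b(u_0,v_0)\, i(u+u_0,\, v+v_0).
\end{align*}
```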

An image of the sky can now be made using the full visibility data set (made using Eq.
14.3.10). However, this involves knowledge of the Fourier transform of the sky brightness
distribution, which in turn is approximated after deconvolution. Hence, in practice one
uses MEM based image recovery, where one maximizes the entropy given by

$$H = -\sum_k I_k \ln\!\left(\frac{I_k}{M_k}\right) \qquad (14.3.11)$$


with $\chi^2$ evaluated as

$$\chi^2 = \sum_k \frac{\left|V(u_k,v_k,l_{0k},m_{0k}) - V^M(u_k,v_k,l_{0k},m_{0k})\right|^2}{\sigma^2_{V(u_k,v_k,l_{0k},m_{0k})}}, \qquad (14.3.12)$$


where $V^M(u_k,v_k,l_{0k},m_{0k})$ is the model visibility evaluated using Eq. 14.3.9. In
each iteration, $\Delta\chi^2$ is estimated by the following steps:

• Initialize $\Delta\chi^2 = 0$

• For all pointings

1. Apply the appropriate primary beam correction to the current estimate of the
image

2. Fourier transform to generate $V^M$

3. Accumulate $\chi^2$

4. Subtract $V^M$ from the observed visibilities

5. Make the residual image

6. Apply the primary beam correction to the residual image

7. Accumulate $\Delta\chi^2$

The operation of primary beam correction on the residual image is understood by the
following argument: for any given pointing, an interferometer gathers radiation within
the primary beam. In the image plane then, any feature outside the range of the primary
beam would be due to the side lobes of the synthesized beam and must be suppressed
before computation of $\Delta\chi^2$. This is achieved by primary beam correction, which
essentially multiplies the image by a Gaussian that represents the main lobe of the antenna
radiation pattern.
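
The loop over pointings described above can be sketched in a few lines of Python. This is only a toy version under several assumptions of our own: the visibilities are taken to be already gridded onto the FFT grid of the image, the primary beam is a Gaussian, the quantity accumulated in the last step is interpreted as the gradient of $\chi^2$ with respect to the image pixels, and all function and dictionary key names are hypothetical.

```python
import numpy as np


def gaussian_beam(shape, center, fwhm_pix):
    """Gaussian primary beam on the image grid, with peak 1 at `center` (row, col)."""
    rows, cols = np.indices(shape)
    r2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    sigma = fwhm_pix / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-r2 / (2.0 * sigma ** 2))


def chi2_and_gradient(image, pointings):
    """One pass over all pointings, returning chi^2 and its image-plane gradient.

    Each pointing is a dict with (hypothetical) keys:
      'beam'  : primary beam image for that pointing
      'mask'  : boolean array, True where a gridded (u, v) cell was sampled
      'vis'   : observed gridded visibilities (complex, same shape as the image)
      'sigma' : noise per visibility (a single float here)
    """
    chi2 = 0.0
    grad = np.zeros_like(image, dtype=float)
    for p in pointings:
        apparent = image * p["beam"]                  # 1. apply the primary beam
        model_vis = np.fft.fft2(apparent)             # 2. Fourier transform to get V^M
        resid = np.where(p["mask"], p["vis"] - model_vis, 0.0)  # 4. V - V^M
        chi2 += np.sum(np.abs(resid) ** 2) / p["sigma"] ** 2    # 3. accumulate chi^2
        # 5. residual image (un-normalised inverse transform of the residuals)
        resid_img = np.fft.ifft2(resid).real * image.size
        # 6.-7. taper by the primary beam and accumulate the gradient
        grad += -2.0 * p["beam"] * resid_img / p["sigma"] ** 2
    return chi2, grad


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    shape = (16, 16)
    sky = rng.random(shape)
    pointings = []
    for center in [(4, 4), (4, 12), (12, 4), (12, 12)]:
        beam = gaussian_beam(shape, center, fwhm_pix=8.0)
        mask = rng.random(shape) < 0.5
        vis = np.where(mask, np.fft.fft2(sky * beam), 0.0)
        pointings.append({"beam": beam, "mask": mask, "vis": vis, "sigma": 1.0})
    chi2, grad = chi2_and_gradient(np.full(shape, 0.5), pointings)
    print("chi2 for a flat trial image:", chi2)
```

Note that every pointing contributes to the same $\chi^2$ and the same gradient image, which is what allows a source seen in several pointings to be constrained by all of them at once.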

This approach (rather than joint deconvolution) has several advantages. 

1. Data from potentially different interferometers for different pointings can be used 

2. Weights on each visibility from each pointing are used in the entire image recon¬ 
struction procedure 

3. Single-dish imaging emerges as a special case 

4. It is fast for extended images 

The most important advantage that one gets by MEM reconstruction is that the deconvolution
is done simultaneously on all pointings. That this is an advantage over joint
deconvolution can be seen as follows: if a point source at the edge of the primary beam
is sampled by 4 different pointings of the telescope, this procedure would be able to use
4 times the data on the same source, as against data from only one pointing in joint
deconvolution (where deconvolution is done separately on each pointing). Apart
from the improvement in the signal-to-noise ratio, this also benefits from the better
uv-coverage available.

Flexible software for performing mosaiced observations is one of the primary motivations
driving the AIPS++ project, in which algorithms to handle mosaiced observations
would be available in full glory.


14.4 Further Reading 

1. Interferometry and Synthesis in Radio Astronomy; Thompson, A. Richard, Moran, 
James M., Swenson Jr., George W.; Wiley-Interscience Publication, 1986. 

2. Synthesis Imaging In Radio Astronomy; Eds. Perley, Richard A., Schwab, Frederic 
R., and Bridle, Alan H.; ASP Conference Series, Vol 6. 



Chapter 15 

Polarimetry 


Jayaram N. Chengalur 


15.1 Introduction 

Consider the simplest kind of electromagnetic wave, i.e. a plane monochromatic wave
of frequency $\nu$ propagating along the +Z axis of a cartesian co-ordinate system. Since
electro-magnetic waves are transverse, the electric field $\mathbf{E}$ must lie in the X-Y plane.
Further, since the wave is mono-chromatic one can write

$$\mathbf{E}(t) = E_x \cos(2\pi\nu t)\,\mathbf{e}_x + E_y \cos(2\pi\nu t + \delta)\,\mathbf{e}_y, \qquad (15.1.1)$$

i.e. the X and Y components of the electric field differ in phase by an amount which does not
depend on time. It can be shown¹ that the implication of this is that over the course of
one period of oscillation, the tip of the electric field vector in general traces out an ellipse.
There are two special cases of interest. The first is when $\delta = 0$. In this case the tip of the
electric field vector traces out a line segment, and the wave is said to be linearly polarized.
The other special case is when $E_x = E_y$ and $\delta = \pm\pi/2$. In this case the electric field vector
traces out a circle in the X-Y plane, and depending on the sense² in which this circle is
traversed the wave is called either left circular polarized or right circular polarized.

¹ See for example, Born & Wolf ‘Principles of Optics’, Sixth Edition, Section 1.4.2

² Note that there is an additional ambiguity here, i.e. are you looking along the direction of propagation of the
wave, or against it? To keep things interesting neither convention is universally accepted, although in principle
one should follow the convention adopted by the IAU (Transactions of the IAU Vol. 15B, (1973), 166.)

³ Recall that as all astrophysically interesting sources are distant, the plane wave approximation is a good
one

As you have already seen in chapter 1, signals in radio astronomy are not monochromatic
waves, but are better described as quasi-monochromatic plane waves³. Further,
the quantity that is typically measured in radio astronomy is not related to the field (i.e.
a voltage), but rather a quantity that has units of voltage squared, i.e. related to some
correlation function of the field (see chapter 4). For these reasons, it is usual to characterize
the polarization properties of the incoming radio signals using quantities called
Stokes parameters. Recall that for a quasi monochromatic wave, the electric field $\mathbf{E}$ could
be considered to be the real part of a complex analytical signal $\mathcal{E}(t)$. If the X and Y
components of this complex analytical signal are $\mathcal{E}_x(t)$ and $\mathcal{E}_y(t)$ respectively, then the four
Stokes parameters are defined as:

$$\begin{array}{lcl}
I = \langle\mathcal{E}_x\mathcal{E}_x^*\rangle + \langle\mathcal{E}_y\mathcal{E}_y^*\rangle & & \langle\mathcal{E}_x\mathcal{E}_x^*\rangle = (I+Q)/2 \\
Q = \langle\mathcal{E}_x\mathcal{E}_x^*\rangle - \langle\mathcal{E}_y\mathcal{E}_y^*\rangle & \text{or} & \langle\mathcal{E}_y\mathcal{E}_y^*\rangle = (I-Q)/2 \\
U = \langle\mathcal{E}_x\mathcal{E}_y^*\rangle + \langle\mathcal{E}_x^*\mathcal{E}_y\rangle & & \langle\mathcal{E}_x\mathcal{E}_y^*\rangle = (U+iV)/2 \\
V = \frac{1}{i}\left(\langle\mathcal{E}_x\mathcal{E}_y^*\rangle - \langle\mathcal{E}_x^*\mathcal{E}_y\rangle\right) & & \langle\mathcal{E}_x^*\mathcal{E}_y\rangle = (U-iV)/2
\end{array} \qquad (15.1.2)$$


where the angle brackets indicate taking the average value⁴. The Stokes parameters as
defined in equation (15.1.2) clearly depend on the orientation of the co-ordinate system.
In radio astronomy it is conventional (see chapter 10) to take the +X axis to point north
and the +Y axis to point east. It is important to realize that the Stokes parameters are
descriptors of the intrinsic polarization state of the electro-magnetic wave, i.e. the Stokes
vector $(I\ Q\ U\ V)^T$ is a true vector. The equations (15.1.2) simply give its components in
a particular co-ordinate system, the linear polarization co-ordinate system⁵. One could
instead work in a circularly polarized reference frame, i.e. where the electric field is decomposed
into two circularly polarized components, $\mathcal{E}_r(t)$ and $\mathcal{E}_l(t)$. The relation between
these components and the Stokes parameters is:


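$$\begin{array}{lcl}
I = \langle\mathcal{E}_r\mathcal{E}_r^*\rangle + \langle\mathcal{E}_l\mathcal{E}_l^*\rangle & & \langle\mathcal{E}_r\mathcal{E}_r^*\rangle = (I+V)/2 \\
Q = \langle\mathcal{E}_r\mathcal{E}_l^*\rangle + \langle\mathcal{E}_r^*\mathcal{E}_l\rangle & & \langle\mathcal{E}_l\mathcal{E}_l^*\rangle = (I-V)/2 \\
U = \frac{1}{i}\left(\langle\mathcal{E}_r\mathcal{E}_l^*\rangle - \langle\mathcal{E}_r^*\mathcal{E}_l\rangle\right) & & \langle\mathcal{E}_r\mathcal{E}_l^*\rangle = (Q+iU)/2 \\
V = \langle\mathcal{E}_r\mathcal{E}_r^*\rangle - \langle\mathcal{E}_l\mathcal{E}_l^*\rangle & & \langle\mathcal{E}_r^*\mathcal{E}_l\rangle = (Q-iU)/2.
\end{array} \qquad (15.1.3)$$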


Interestingly, equations (15.1.3) are formally identical to equations (15.1.2) apart from
the following transformations, viz. $Q^{+} \rightarrow V^{\odot}$, $U^{+} \rightarrow Q^{\odot}$, $V^{+} \rightarrow U^{\odot}$, where the superscript
$+$ indicates linear polarized co-ordinates and $\odot$ circular polarized co-ordinates. Although
these two co-ordinate systems are the ones most frequently used, the Stokes vector could
in principle be written in any co-ordinate system based on two linearly independent (but
not necessarily orthogonal) polarization states. In fact, as we shall see, such non-orthogonal
co-ordinate systems will arise naturally when trying to describe measurements made
with non ideal radio telescopes.

The degree of polarization of the wave is defined as 


$$P = \frac{\sqrt{Q^2 + U^2 + V^2}}{I}. \qquad (15.1.4)$$


From equation (15.1.2) we have 

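$$I^2 - Q^2 - U^2 - V^2 = 4\left(\langle|\mathcal{E}_x|^2\rangle\,\langle|\mathcal{E}_y|^2\rangle - |\langle\mathcal{E}_x\mathcal{E}_y^*\rangle|^2\right) \qquad (15.1.5)$$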

and hence from the Schwarz inequality it follows that $0 \le P \le 1$ and that $P = 1$ iff $\mathcal{E}_x = c\,\mathcal{E}_y$,
where $c$ is some complex constant. For a mono-chromatic plane wave (equation (15.1.1))
therefore, $P = 1$ or equivalently $I^2 = Q^2 + U^2 + V^2$, i.e. there are only three independent
Stokes parameters. For a general quasi mono-chromatic wave, $P < 1$, and the wave is
said to be partially polarized.

It is also instructive to examine the Stokes parameters separately for the special case 
of a monochromatic plane wave. We have (see equations (15.1.1) and (15.1.2)): 

$$\begin{array}{ll}
I = E_x^2 + E_y^2 & U = 2E_xE_y\cos\delta \\
Q = E_x^2 - E_y^2 & V = 2E_xE_y\sin\delta,
\end{array}$$

i.e. for a linearly polarized wave ($\delta = 0$) we have $V = 0$, and for a circularly polarized wave
($E_x = E_y$, $\delta = \pm\pi/2$) we have $Q = U = 0$. So Q and U measure linear polarization, and
V measures circular polarization. This interpretation continues to be true in the case of
partially polarized waves.

⁴ Strictly speaking this is the ensemble average. However, as always, we will assume that the signals are
ergodic, i.e. the ensemble average can be replaced with the time average.

⁵ These polarization co-ordinate systems are of course in some abstract polarization space and not real space
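
The two special cases above are easy to check numerically. The sketch below builds the complex analytic signal of a monochromatic wave, averages over one period, and evaluates equations (15.1.2) and (15.1.4). The function names are ours, and the sign of the phase in the analytic signal is an assumption chosen so that $V = 2E_xE_y\sin\delta$, as in the text.

```python
import numpy as np


def monochromatic_wave(e_x, e_y, delta, n_samples=64):
    """Complex analytic X and Y signals over one period (time in units of 1/nu)."""
    t = np.arange(n_samples) / n_samples
    carrier = np.exp(-2j * np.pi * t)        # assumed sign convention
    return e_x * carrier, e_y * carrier * np.exp(-1j * delta)


def stokes_linear(sig_x, sig_y):
    """Stokes I, Q, U, V from X/Y analytic signals, following equation (15.1.2)."""
    xx = np.mean(sig_x * np.conj(sig_x)).real
    yy = np.mean(sig_y * np.conj(sig_y)).real
    xy = np.mean(sig_x * np.conj(sig_y))     # <E_x E_y^*> = (U + iV)/2
    return xx + yy, xx - yy, 2.0 * xy.real, 2.0 * xy.imag


def degree_of_polarization(i, q, u, v):
    """Equation (15.1.4)."""
    return np.sqrt(q ** 2 + u ** 2 + v ** 2) / i


if __name__ == "__main__":
    for label, args in [("linear", (1.0, 0.5, 0.0)),
                        ("circular", (1.0, 1.0, np.pi / 2))]:
        i, q, u, v = stokes_linear(*monochromatic_wave(*args))
        print(label, i, q, u, v, degree_of_polarization(i, q, u, v))
```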


15.2 Polarization in Radio Astronomy 

Emission mechanisms which are dominant in low frequency radio astronomy produce
linearly polarized emission. Thus extra-galactic radio sources and pulsars are predominantly
linearly polarized, with polarization fractions of typically a few percent. These
sources usually have no circular polarization, i.e. V ~ 0. Maser sources however, in
particular OH masers from galactic star forming regions, often have significant circular
polarization. This is believed to arise because of Zeeman splitting. Interstellar maser
sources also often have some linear polarization, i.e. all the components of the Stokes
vector are non zero. In radio astronomy the polarization is fundamentally related to the
presence of magnetic fields, and polarization studies of sources are aimed at understanding
their magnetic fields.

The raw polarization measured by a radio telescope could differ from the true polarization
of the source because of a number of effects, some due to propagation of the
wave through the medium between the source and the telescope (see chapter 16), and
others because of various instrumental non-idealities. Since we are eventually interested
in the true source polarization, our ultimate aim will be to correct for these various
effects, and we will therefore find it important to distinguish between depolarizing and
non-depolarizing systems. A system for which the outgoing wave is fully polarized if the
incoming wave is fully polarized is called non-depolarizing. The polarization state of the
output wave need not be identical to that of the incoming wave; it is only necessary that
$P_{\rm out} = 1$ if $P_{\rm in} = 1$.

The most important propagation effect is Faraday rotation, which is covered in some
detail in chapter 16. Here we restrict ourselves to stating that the plane of polarization
of a linearly polarized wave is rotated on passing through a magnetized plasma. Faraday
rotation can occur both in the ISM as well as in the earth’s ionosphere. If the Faraday
rotating medium is mixed up with the emitting region, then radiation emitted from different
depths along the line of sight is rotated by different amounts, thus reducing the net
polarization. This is called Faraday depolarization. If the medium is located between the
source and the observer, then the only effect is a net rotation of the plane of polarization,
i.e.

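$$\mathcal{E}'_x = \mathcal{E}_x\cos\chi + \mathcal{E}_y\sin\chi, \qquad \mathcal{E}'_y = -\mathcal{E}_x\sin\chi + \mathcal{E}_y\cos\chi, \qquad (15.2.6)$$

where $\mathcal{E}_x$, $\mathcal{E}'_x$ are the X components of the incident and emergent field respectively, and
similarly for $\mathcal{E}_y$, $\mathcal{E}'_y$. In terms of the Stokes parameters, the transformation on passing
through a Faraday rotating medium is

$$\begin{array}{ll}
I' = I & Q' = Q\cos 2\chi + U\sin 2\chi \\
V' = V & U' = -Q\sin 2\chi + U\cos 2\chi,
\end{array} \qquad (15.2.7)$$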

i.e. a rotation of the Stokes vector in the (Q, U) plane. The fractional polarization is
hence preserved⁶. Equation (15.2.7) can also be easily obtained from equation (15.1.3)
by noting that in a circularly polarized co-ordinate system, the effect of Faraday rotation
is to introduce a phase difference of $2\chi$ between $\mathcal{E}_r$ and $\mathcal{E}_l$.

⁶ Note that non-depolarizing only means that $P_{\rm out} = 1$ if $P_{\rm in} = 1$, and this does not necessarily translate into
conservation of the fractional polarization when $P < 1$. Pure Faraday rotation is hence not only non-depolarizing,
it also preserves the fractional polarization.
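
As a quick numerical check that rotating the field components by $\chi$ (equation (15.2.6)) rotates Q and U by $2\chi$ (equation (15.2.7)), one can compare the two routes directly. This is a sketch for a single fully polarized field sample, so no averaging is needed; the function names are our own.

```python
import numpy as np


def stokes_from_fields(e_x, e_y):
    """Stokes (I, Q, U, V) of a fully polarized field, following equation (15.1.2)."""
    xy = e_x * np.conj(e_y)
    return np.array([abs(e_x) ** 2 + abs(e_y) ** 2,
                     abs(e_x) ** 2 - abs(e_y) ** 2,
                     2.0 * xy.real,
                     2.0 * xy.imag])


def faraday_rotate_fields(e_x, e_y, chi):
    """Equation (15.2.6): rotation of the field components by chi."""
    return (e_x * np.cos(chi) + e_y * np.sin(chi),
            -e_x * np.sin(chi) + e_y * np.cos(chi))


def faraday_rotate_stokes(stokes, chi):
    """Equation (15.2.7): rotation of (Q, U) by 2*chi, with I and V unchanged."""
    i, q, u, v = stokes
    return np.array([i,
                     q * np.cos(2 * chi) + u * np.sin(2 * chi),
                     -q * np.sin(2 * chi) + u * np.cos(2 * chi),
                     v])


if __name__ == "__main__":
    e_x, e_y, chi = 1.0 + 0.2j, 0.3 - 0.7j, 0.4   # arbitrary test values
    via_fields = stokes_from_fields(*faraday_rotate_fields(e_x, e_y, chi))
    via_stokes = faraday_rotate_stokes(stokes_from_fields(e_x, e_y), chi)
    print(np.allclose(via_fields, via_stokes))
```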

Consider looking at an extended source which is not uniformly polarized with a radio
telescope whose resolution is poorer than the angular scale over which the source polarization
is coherent. In any given resolution element then there are regions with different
polarization characteristics. The beam thus smoothes out the polarization of the source,
and the measured polarization will be less than the true source polarization. This is
called beam depolarization. Beam depolarization cannot in principle be corrected for; the
only way to obtain the true source polarization is to observe with sufficiently high angular
resolution.

A dual polarized radio telescope has two voltage beam patterns, one for each polarization.
These two patterns are often not symmetrical, i.e. in certain directions the telescope
response is greater for one polarization than for the other. The difference in gain between
these two polarizations usually varies in a systematic way over the primary beam.
Because of this asymmetry, an unpolarized source could appear to be polarized, and further
its apparent Stokes parameters in general depend on its location with respect to the
center of the primary beam. The polarization properties of an antenna are also sharply
modulated by the presence of feed legs, etc. and are hence difficult to determine with
sufficient accuracy. For this reason determining the polarization across sources with dimensions
comparable to the primary beam is a non trivial problem. Given the complexity
of dealing with extended sources, most analyses to date have been restricted to small
sources, ideally point sources located at the beam center.

Most radio telescopes measure non-orthogonal polarizations, i.e. a channel p which is 
supposed to be matched to some particular polarization p also picks up a small quantity 
of the orthogonal polarization q. Further, this leakage of the orthogonal polarization in 
general changes with position in the beam. However, for reflector antennas, there is often 
a leakage term that is independent of the location in the beam, which is traditionally 
ascribed to non idealities in the feed. For example, for dipole feeds, if the two dipoles are 
not mounted exactly at right angles to one another, the result is a real leakage term, and 
if the dipole is actually matched to a slightly elliptical (and not purely linear) polarization 
the result is an imaginary leakage term. For this reason, the real part of the leakage is 
sometimes called an orientation error, and the imaginary part of the leakage is referred 
to as an ellipticity error⁷. However, one should appreciate that the actual measurable
quantity is only the antenna voltage beam, (i.e. the combined response of the feed and 
reflector) and this decomposition into ‘feed’ related terms is not fundamental and need 
not in general be physically meaningful. 

The final effect that has to be taken into account has to do with the orientation of the
antenna beam with respect to the source. For equatorially mounted telescopes this is a
constant; however, for alt-az mounted telescopes, the telescope beam rotates on the sky
as the telescope tracks the source. This rotation is characterized by an angle called the
parallactic angle, $\psi_p$, which is given by:

$$\tan\psi_p = \frac{\cos\mathcal{L}\,\sin\mathcal{H}}{\sin\mathcal{L}\,\cos\delta - \cos\mathcal{L}\,\sin\delta\,\cos\mathcal{H}}, \qquad (15.2.8)$$


where $\mathcal{L}$ is the latitude of the telescope, $\mathcal{H}$ is the hour-angle of the source, and $\delta$ is the
apparent declination of the source. So if one observes a source at a parallactic angle
$\psi_p$ with a telescope that is linearly polarized, the voltages that will be obtained at the
terminals of the X and Y receivers will be

$$V_x = G_x(\mathcal{E}_x\cos\psi_p + \mathcal{E}_y\sin\psi_p), \qquad V_y = G_y(-\mathcal{E}_x\sin\psi_p + \mathcal{E}_y\cos\psi_p), \qquad (15.2.9)$$

where $G_x$ and $G_y$ are the complex gains (i.e. the product of the antenna voltage gains and
the receiver gains) of the X and Y channels.

⁷ Several telescopes, such as for example the GMRT, use feeds which are sensitive to linear polarization, but
by using appropriate circuitry (viz. a $\pi/2$ phase lag along one signal path before the first RF amplifier) convert the
signals into circular polarization. Non idealities in this linear to circular conversion circuit could also produce
complex leakage terms even if the feed dipoles themselves are error free.
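
A small sketch of the parallactic angle and of the resulting receiver voltages is given below. It uses the standard expression for the parallactic angle, with $\cos\mathcal{H}$ in the last term of the denominator, and `arctan2` to pick the quadrant; the function names and the example latitude (about 19.1° north, roughly that of the GMRT) are illustrative assumptions.

```python
import numpy as np


def parallactic_angle(latitude, hour_angle, declination):
    """Parallactic angle in radians; all arguments are in radians."""
    num = np.cos(latitude) * np.sin(hour_angle)
    den = (np.sin(latitude) * np.cos(declination)
           - np.cos(latitude) * np.sin(declination) * np.cos(hour_angle))
    return np.arctan2(num, den)


def receiver_voltages(e_x, e_y, psi_p, g_x=1.0, g_y=1.0):
    """Voltages at the X and Y receiver terminals, following equation (15.2.9)."""
    v_x = g_x * (e_x * np.cos(psi_p) + e_y * np.sin(psi_p))
    v_y = g_y * (-e_x * np.sin(psi_p) + e_y * np.cos(psi_p))
    return v_x, v_y


if __name__ == "__main__":
    lat = np.radians(19.1)       # assumed example latitude
    dec = np.radians(-30.0)      # assumed example declination
    for hour in (-3.0, 0.0, 3.0):
        ha = np.radians(15.0 * hour)   # 15 degrees of hour angle per hour
        psi = parallactic_angle(lat, ha, dec)
        print(hour, np.degrees(psi), receiver_voltages(1.0, 0.0, psi))
```

For a purely X-polarized incident field, the printout shows how the signal is shared between the X and Y receivers as the source is tracked across the sky.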


15.3 The Measurement Equation 


In this section we will develop a mathematical formulation useful for polarimetric interferometry.
The theoretical framework is the van Cittert-Zernike theorem, which was
discussed in chapter 2 in the context of the reconstruction of the Stokes I parameter of
the source. However, as can be trivially verified, the theorem holds good for any of the
Stokes parameters. So, apart from the issues of spurious polarization produced by propagation
or instrumental effects, making maps of the Q, U, and V Stokes parameters is in
principle⁸ identical to making a Stokes I map.

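Not surprisingly, matrix notation leads to an elegant formulation for polarimetric interferometry⁹.
Let us begin by defining a coherency vector,

$$\begin{pmatrix}
\langle\mathcal{E}_{ap}\mathcal{E}_{bp}^*\rangle \\
\langle\mathcal{E}_{ap}\mathcal{E}_{bq}^*\rangle \\
\langle\mathcal{E}_{aq}\mathcal{E}_{bp}^*\rangle \\
\langle\mathcal{E}_{aq}\mathcal{E}_{bq}^*\rangle
\end{pmatrix},$$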

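where a, b refer to the two antennas which compose any given baseline, and p, q are the
two polarizations measured by the antenna. The coherency vector can be expressed as
an outer product of the electric field, viz:

$$\begin{pmatrix}
\langle\mathcal{E}_{ap}\mathcal{E}_{bp}^*\rangle \\
\langle\mathcal{E}_{ap}\mathcal{E}_{bq}^*\rangle \\
\langle\mathcal{E}_{aq}\mathcal{E}_{bp}^*\rangle \\
\langle\mathcal{E}_{aq}\mathcal{E}_{bq}^*\rangle
\end{pmatrix}
= \left\langle
\begin{pmatrix}\mathcal{E}_{ap} \\ \mathcal{E}_{aq}\end{pmatrix}
\otimes
\begin{pmatrix}\mathcal{E}_{bp}^* \\ \mathcal{E}_{bq}^*\end{pmatrix}
\right\rangle. \qquad (15.3.10)$$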


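The Stokes vector can be obtained by multiplying the coherency vector with the Stokes
matrix, ($\mathbf{S}$). In a linear polarized co-ordinate system the components are:

$$\begin{pmatrix} I \\ Q \\ U \\ V \end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & 1 \\
1 & 0 & 0 & -1 \\
0 & 1 & 1 & 0 \\
0 & -i & i & 0
\end{pmatrix}
\begin{pmatrix}
\langle\mathcal{E}_{ax}\mathcal{E}_{bx}^*\rangle \\
\langle\mathcal{E}_{ax}\mathcal{E}_{by}^*\rangle \\
\langle\mathcal{E}_{ay}\mathcal{E}_{bx}^*\rangle \\
\langle\mathcal{E}_{ay}\mathcal{E}_{by}^*\rangle
\end{pmatrix} \qquad (15.3.11)$$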


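The component form could also be written down in the circular polarized co-ordinate
system, in which case the matrix $\mathbf{S}$ would be:

$$\begin{pmatrix} I \\ Q \\ U \\ V \end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 \\
0 & -i & i & 0 \\
1 & 0 & 0 & -1
\end{pmatrix}
\begin{pmatrix}
\langle\mathcal{E}_{ar}\mathcal{E}_{br}^*\rangle \\
\langle\mathcal{E}_{ar}\mathcal{E}_{bl}^*\rangle \\
\langle\mathcal{E}_{al}\mathcal{E}_{br}^*\rangle \\
\langle\mathcal{E}_{al}\mathcal{E}_{bl}^*\rangle
\end{pmatrix} \qquad (15.3.12)$$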


⁸ Apart from the fact that one has to record four correlation functions, $\langle\mathcal{E}_{ap}\mathcal{E}_{bp}^*\rangle$, $\langle\mathcal{E}_{ap}\mathcal{E}_{bq}^*\rangle$, $\langle\mathcal{E}_{aq}\mathcal{E}_{bp}^*\rangle$,
$\langle\mathcal{E}_{aq}\mathcal{E}_{bq}^*\rangle$, where a, b refer to the two antennas which compose any given baseline, and p, q are the two
polarizations measured by the antenna. Since Stokes I maps are often all that is required, many observatories,
including the GMRT, make a trade off such that fewer spectral channels are available if you record all four
correlation products than if you record only the two correlation products which are required for Stokes I.

⁹ Although this formulation has been in use in the field of optical polarimetry for decades, it was not appreciated
until recently (Hamaker et al. 1996, and Sault et al. 1996) that it is also extendable to radio interferometric
arrays.

The matrix in equation (15.3.12) is related to that in equation (15.3.11) by a simple
permutation of rows, as expected.

The outer product has the following associative property (often called the mixed-product
property), viz. for matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$,

$$(\mathbf{A}\mathbf{B}) \otimes (\mathbf{C}\mathbf{D}) = (\mathbf{A} \otimes \mathbf{C})(\mathbf{B} \otimes \mathbf{D}).$$
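
This property is what later lets a chain of per-antenna matrices on each side of a baseline be collected into a single outer product. A minimal numerical check, using NumPy's `kron` as the outer product and random complex 2 × 2 matrices of our own choosing, might look like:

```python
import numpy as np


def random_complex_matrix(rng, size=2):
    """A random complex matrix, used purely as test input."""
    return rng.normal(size=(size, size)) + 1j * rng.normal(size=(size, size))


def mixed_product_holds(a, b, c, d):
    """Check (AB) kron (CD) == (A kron C)(B kron D) to numerical precision."""
    lhs = np.kron(a @ b, c @ d)
    rhs = np.kron(a, c) @ np.kron(b, d)
    return np.allclose(lhs, rhs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b, c, d = (random_complex_matrix(rng) for _ in range(4))
    print(mixed_product_holds(a, b, c, d))
```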

For any one antenna a, putting in all the various effects discussed in section (15.2), we can
write the voltage at the antenna terminals as:

$$\mathbf{V}_a = \mathbf{G}_a\mathbf{B}_a\mathbf{P}_a\mathbf{F}_a\,\mathcal{E}_a = \mathbf{J}_a\,\mathcal{E}_a, \qquad (15.3.13)$$


where,

$\mathbf{V}_a$ = the voltage vector at the terminals of antenna a
$\mathbf{G}_a$ = the complex gain of the receivers of antenna a
$\mathbf{B}_a$ = the voltage beam matrix for antenna a
$\mathbf{P}_a$ = the parallactic angle matrix for antenna a
$\mathbf{F}_a$ = the Faraday rotation matrix for antenna a
$\mathcal{E}_a$ = the electric field vector at antenna a
$\mathbf{J}_a$ = the Jones matrix for antenna a

The Jones matrix has been so called because of its analogy with the Jones matrix in
optical polarimetry. All of these matrices are 2 × 2. In the linear polarized co-ordinate
system, for example, we have:


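$$\mathbf{F} = \begin{pmatrix} \cos\chi & \sin\chi \\ -\sin\chi & \cos\chi \end{pmatrix}, \qquad
\mathbf{P} = \begin{pmatrix} \cos\psi_p & \sin\psi_p \\ -\sin\psi_p & \cos\psi_p \end{pmatrix},$$

$$\mathbf{B} = \begin{pmatrix} b_{pp}(l,m) & b_{pq}(l,m) \\ b_{qp}(l,m) & b_{qq}(l,m) \end{pmatrix}, \qquad
\mathbf{G} = \begin{pmatrix} g_p & 0 \\ 0 & g_q \end{pmatrix}. \qquad (15.3.14)$$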

The Jones matrix in polarimetric interferometry plays the same role as the complex
gain does in scalar interferometry. Consequently one could conceive of schemes for
self-calibration, since for an array with a large enough number of antennas a sufficient number
of closure constraints are available. However, since astrophysical sources are usually
only weakly polarized, the signal to noise ratio in the cross-hand correlation products is
often too low to make use of these closure constraints.

In scalar interferometry, phase fluctuations caused by the atmosphere and/or ionosphere
were lumped together with the instrumental gain fluctuations. In the vector formulation
however, this is strictly speaking not possible, since these corrections occur at
different points along the signal path (see equation (15.3.13)), and the matrices in equations
(15.3.14) do not in general commute. However, for most existing radio telescopes,
and for sources small compared to the primary beam, the matrices in equations (15.3.14)
(apart from the Faraday rotation and parallactic angle matrices) differ from the identity
matrix only to first order (i.e. the off diagonal terms are small compared to the diagonal
terms, and the diagonal terms are equal to one another to zeroth order), and consequently
these matrices commute to first order. To first order hence, it is correct to lump the phase
differences accumulated at different points along the signal path into the receiver gain.
Alternatively, if we make the (reasonable) assumption that the complex attenuation (i.e.
any absorption and phase fluctuation) produced by the atmosphere is identical for both
polarizations, then it can be modeled as a constant times the identity matrix. Since
the identity matrix commutes with all the other matrices, this factor can be absorbed in
the receiver gain matrix, exactly as was done when dealing with interferometry of scalar
fields. This is the reason why no separate matrix was introduced in equation (15.3.13) to
account for atmospheric phase and amplitude fluctuations.

The matrix $\mathbf{B}$ in this formulation also deserves some attention. It simply contains
the information on the relation between the electric field falling on the antenna and the
voltage generated at the antenna terminals. It is an extension of the voltage beam in
scalar field theory, and each element in the matrix depends on the sky co-ordinates $(l,m)$.
As described above in section (15.2), it is traditional to decompose it into a part which
does not depend on $(l,m)$, which is called the leakage (or in the matrix formulation, the
leakage matrix “D”), and a part which depends on $(l,m)$. Provided that the leakage terms
are small compared to the parallel hand antenna voltage gain, it can be shown that this
decomposition is unique to first order.

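In terms of the Jones matrix, the measured visibility on a single baseline for a point
source at the phase center can be written as:

$$\begin{pmatrix} V_I \\ V_Q \\ V_U \\ V_V \end{pmatrix}
= \mathbf{S}\,(\mathbf{J}_a \otimes \mathbf{J}_b^*)\,\mathbf{S}^{-1}
\begin{pmatrix} I \\ Q \\ U \\ V \end{pmatrix}. \qquad (15.3.15)$$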


Note that this is a matrix equation, valid in all co-ordinate frames, i.e. it holds regardless 
of whether the antennas are linear polarized or circular polarized. In fact it holds even if 
some of the antennas are linear polarized, and the others are circular polarized. 
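
To make the structure of equation (15.3.15) concrete, the sketch below builds a Jones matrix for each antenna as the product in equation (15.3.13) and propagates a Stokes vector through one baseline. It is a toy model under our own assumptions: linear feeds, the Stokes matrix normalised so that it reproduces equation (15.1.2) directly, an ideal (identity) voltage beam unless one is supplied, and hypothetical function names.

```python
import numpy as np

# Stokes matrix for linear feeds: rows give I, Q, U, V in terms of
# (<E_x E_x*>, <E_x E_y*>, <E_y E_x*>, <E_y E_y*>), consistent with (15.1.2).
S_LINEAR = np.array([[1, 0, 0, 1],
                     [1, 0, 0, -1],
                     [0, 1, 1, 0],
                     [0, -1j, 1j, 0]], dtype=complex)


def rotation(angle):
    """Rotation matrix of the form used for F and P in equation (15.3.14)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, s], [-s, c]], dtype=complex)


def jones(gain_p, gain_q, psi_p, chi, beam=None):
    """J = G B P F for one antenna; the beam defaults to the identity (ideal feed)."""
    g = np.diag([gain_p, gain_q]).astype(complex)
    b = np.eye(2, dtype=complex) if beam is None else np.asarray(beam, dtype=complex)
    return g @ b @ rotation(psi_p) @ rotation(chi)


def measured_stokes_visibility(j_a, j_b, stokes):
    """Equation (15.3.15) for a point source at the phase center."""
    mueller = S_LINEAR @ np.kron(j_a, np.conj(j_b)) @ np.linalg.inv(S_LINEAR)
    return mueller @ np.asarray(stokes, dtype=complex)


if __name__ == "__main__":
    source = [1.0, 0.05, 0.02, 0.0]                # assumed I, Q, U, V
    j_a = jones(1.0, 1.0, psi_p=0.3, chi=0.0)
    j_b = jones(1.0, 1.0, psi_p=0.3, chi=0.0)
    print(measured_stokes_visibility(j_a, j_b, source).real)
```

With unit gains and the same parallactic angle at both antennas, the output shows Q and U rotated by twice that angle, which is the effect exploited when a single calibrator is observed over a range of hour angles.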

If the point source were not at the phase center, then the visibility phase is not zero,
and in equation (15.3.15) one would have to pre-multiply the Jones matrices with a matrix
containing the Fourier kernel, viz. $\mathbf{K}_a(l,m)$ and $\mathbf{K}_b(l,m)$, defined as:

$$\mathbf{K}_a(l,m) = \begin{pmatrix} e^{-2\pi i(u_a l + v_a m)} & 0 \\ 0 & e^{-2\pi i(u_a l + v_a m)} \end{pmatrix}, \qquad
\mathbf{K}_b(l,m) = \begin{pmatrix} e^{-2\pi i(u_b l + v_b m)} & 0 \\ 0 & e^{-2\pi i(u_b l + v_b m)} \end{pmatrix}. \qquad (15.3.16)$$


To get the visibility for an extended incoherent source, one would have to integrate
over all $(l,m)$, thus recovering the vector formulation of the van Cittert-Zernike theorem.
In order to invert this equation, it is necessary not only to do the inverse Fourier transform,
but also to correct for the various corruptions introduced, i.e. the data has to be
calibrated. The rest of this chapter discusses ways in which this polarization calibration
can be done.


15.4 Polarization Calibration 

We restrict our attention to a point source at the phase center¹⁰. The visibility that we
measure, averaged over all baselines, is

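$$\mathbf{V} = \frac{1}{N(N-1)} \sum_{a \neq b} \mathbf{S}\,(\mathbf{J}_a \otimes \mathbf{J}_b^*)\,\mathbf{S}^{-1}\,(I\ Q\ U\ V)^T. \qquad (15.4.17)$$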

Any system describable by a Jones matrix is non-depolarizing¹¹. In the general case
however, the summation in equation (15.4.17) cannot be represented by a single Jones
matrix, and an interferometer is therefore not a non-depolarizing system. However, ideally,
after calibration, the effective Jones matrices are all the unit matrix, and the interferometer
would then be non-depolarizing.

¹⁰ For VLBI observations this is a very good approximation, since the source being imaged is very small compared
to the primary beams of any of the antennas in the VLBI array.

¹¹ This follows trivially from the fact that for 100% polarization we must have $\mathcal{E}_p = c\,\mathcal{E}_q$, where p, q are any two
orthogonal polarizations, and c is some complex constant. Multiplication by the Jones matrix will preserve this
relationship (only changing the value of the constant c) thus producing another 100% polarized wave.

Intuitively, it is clear that if one looks at an unpolarized calibrator source, one should
be able to solve for the leakage terms (which will produce apparent polarization), but that
some degrees of freedom would remain unconstrained. Further, it is also intuitive that
the degrees of freedom which remain unconstrained are the following: (1) the absolute
orientation of the feeds, (2) the intrinsic polarization of the feeds (i.e. for example, are
they linear polarized or circular polarized?) and (3) the phase difference between the two
polarizations. While one would imagine that the situation may be improved by observation
of a polarized source, it turns out that this too is not sufficient to determine all the
free parameters. What is required is observations of at least three differently polarized
sources. For alt-az mounted dishes, the rotation of the beam with respect to the sky
changes the apparent polarization of the source. For such telescopes hence, it is sufficient
to observe a single source at several, sufficiently different hour angles. This is the
polarization calibration strategy that is commonly used at most telescopes. Faraday rotation due to
the earth’s ionosphere is more difficult to correct for. In principle, models of the ionosphere
coupled with a measure of the total electron content at the time of the observation
can be used to apply a first order correction to the data.

We end this chapter with a brief description of the effect of calibration errors on the
derived Stokes parameters. When observing with linearly polarized feeds, from equation
(15.1.2) it is clear that if one observes a linearly polarized calibrator, the parallel-hand
correlations will contain a contribution due to the Q component of the calibrator
flux. Consequently, if one assumes (erroneously) that the calibrator was unpolarized, the
gain of the X channel will be overestimated and that of the Y channel underestimated.
For this reason, for observations which require only measurement of Stokes I, circular
feeds are preferable, since the Stokes V component of most calibrators is negligible, and
consequently, measurements of the parallel hand correlations¹² are sufficient to measure
the correct Stokes I flux.

It is easy to show that (to first order) if one observes a polarized calibrator with an error
free linearly polarized interferometer and solves for the instrumental parameters under
the assumption that the calibrator is unpolarized, the derived instrumental parameters
of all the antennas will be in error by (see footnote 13):

$$\Delta g_x = +Q/2I, \qquad \Delta g_y = -Q/2I,$$

$$d_x = (Q + iU)/2I, \qquad d_y = -(U - iQ)/2I,$$

where:

$\Delta g_x$ is the gain error of the X channel.

$\Delta g_y$ is the gain error of the Y channel.

$d_x$ is the leakage from the Y channel to the X channel.

$d_y$ is the leakage from the X channel to the Y channel.

If these calibration solutions are then applied to an unpolarized target source, then the 
source will appear to be polarized, with the same polarization percentage as the calibrator, 
but opposite sense. This again is simply the extension from scalar interferometry that if 
the calibrator flux is in error by some amount, the derived target source flux will be in 
error by the same fractional amount, but with opposite sense. 


12 Recall from equations (15.1.3) that when $V = 0$, $\langle E_r E_r^* \rangle + \langle E_l E_l^* \rangle = I$.

13 A similar result can of course be derived for the case of circularly polarized antennas; the only
difference will be the usual transpositions of Q, U, and V.

Chapter 16


Ionospheric effects in Radio 
Astronomy 

A. P. Rao 

16.1 Introduction 

At the low densities encountered in the further reaches of the earth’s atmosphere and
in outer space, collisions between particles are very rare. Hence, unlike in a terrestrial
laboratory, it is possible for gas to remain in an ionized state for long periods of time.
Such plasmas are ubiquitous in astrophysics, and have been extensively studied for their
own sake. In this chapter however, we focus on the effects of such plasmas on radio waves
propagating through them, and will find astrophysical plasmas to be largely of nuisance
value.

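The refractive index of a cold neutral plasma is given by

$$\mu(\nu) = \sqrt{1 - \frac{\nu_p^2}{\nu^2}} \tag{16.1.1}$$

where $\nu_p$, the “plasma frequency”, is given by

$$\nu_p = \sqrt{\frac{n_e e^2}{\pi m_e}} \simeq 9\sqrt{n_e}\ \mathrm{kHz} \tag{16.1.2}$$
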

where $e$ is the charge on the electron, $m_e$ is the mass of the electron and $n_e$ is the electron
number density (in cm$^{-3}$). At frequencies below the plasma frequency $\nu_p$ the refractive
index becomes imaginary, i.e. the wave is exponentially attenuated and does not propagate
through the medium. The earth’s ionosphere has electron densities $\sim 10^4 - 10^5$ cm$^{-3}$,
which means that the plasma frequency is $\sim 1 - 10$ MHz. Radio waves with such low
frequencies do not reach the earth’s surface and can be studied only by space based
telescopes. The plasma between the planets is called the Interplanetary Medium (IPM) and
has electron densities $\sim 1$ cm$^{-3}$ (at the earth’s location); the corresponding cut off
frequency is $\sim 9$ kHz. The typical density in the Interstellar Medium (ISM) is $\sim 0.03$ cm$^{-3}$,
for which the cut off frequency is $\sim 1$ kHz. Waves of such low frequency from extra solar
system objects cannot be observed even by spacecraft since the IPM and ISM will attenuate
them severely.
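
The cut off frequencies quoted above can be checked with a few lines of code. The following is a minimal sketch, assuming only the approximate relation of eqn 16.1.2 (plasma frequency of about 9 kHz times the square root of the density in cm$^{-3}$); the function names are illustrative and are not part of any observatory software.

```python
import cmath
import math


def plasma_frequency_khz(n_e_cm3):
    """Approximate plasma frequency in kHz for an electron density in cm^-3."""
    return 9.0 * math.sqrt(n_e_cm3)


def refractive_index(nu_khz, n_e_cm3):
    """Refractive index of a cold plasma; purely imaginary below the cut off."""
    x = 1.0 - (plasma_frequency_khz(n_e_cm3) / nu_khz) ** 2
    return cmath.sqrt(x)


# Densities are the representative values quoted in the text.
media = [
    ("ionosphere (low)", 1e4),
    ("ionosphere (high)", 1e5),
    ("IPM near the earth", 1.0),
    ("ISM", 0.03),
]
for name, n_e in media:
    print(f"{name:20s} n_e = {n_e:8.2g} cm^-3   nu_p = {plasma_frequency_khz(n_e):9.1f} kHz")
```

A wave at half the local plasma frequency gives a purely imaginary `refractive_index`, which is the numerical counterpart of the statement that such a wave is attenuated rather than propagated.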
The dispersion relationship in a cold plasma is given by $c^2 k^2 = \omega^2 - \omega_p^2$ (with
$\omega = 2\pi\nu$ and $\omega_p = 2\pi\nu_p$). Since this is a non linear relation there are two
characteristic velocities of propagation, the phase velocity given by

$$v_p = \frac{\omega}{k} = \frac{c}{\mu} \tag{16.1.3}$$

and the group velocity which is given by

$$v_g = \frac{d\omega}{dk} = c\mu \simeq c\left(1 - \frac{\nu_p^2}{2\nu^2}\right) \tag{16.1.4}$$

where for the last expression we have assumed that $\nu \gg \nu_p$ (which is usually the regime
of interest).
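
As a quick numerical check of the two velocities, the sketch below evaluates the exact phase and group velocities that follow from the cold plasma dispersion relation and compares the group velocity with its high frequency approximation. The frequencies used (150 MHz and a 10 MHz plasma frequency) are illustrative assumptions, not values taken from the text.

```python
import math

C = 2.998e8  # speed of light in m/s


def phase_and_group_velocity(nu_hz, nu_p_hz):
    """Exact phase and group velocity for nu above the plasma frequency."""
    mu = math.sqrt(1.0 - (nu_p_hz / nu_hz) ** 2)
    return C / mu, C * mu


def group_velocity_approx(nu_hz, nu_p_hz):
    """First order approximation, valid when nu is much larger than nu_p."""
    return C * (1.0 - nu_p_hz ** 2 / (2.0 * nu_hz ** 2))


v_phase, v_group = phase_and_group_velocity(150e6, 10e6)
print(f"v_phase/c - 1 = {v_phase / C - 1.0:+.6f}")
print(f"v_group/c - 1 = {v_group / C - 1.0:+.6f}")
print(f"approx v_group/c - 1 = {group_velocity_approx(150e6, 10e6) / C - 1.0:+.6f}")
```

The product of the two exact velocities is $c^2$, so the phase velocity exceeds $c$ by (to first order) the same fraction by which the group velocity falls short of it.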


16.2 Propagation Through a Homogeneous Plasma 


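Even above the cutoff frequency there are various propagation effects that are important
for a radio wave passing through a plasma. Let us start with the most straightforward
ones. Consider a radio signal passing through a homogeneous slab of plasma of length $L$.
The signal is delayed (with respect to the propagation time in the absence of the plasma)
by the amount

$$\Delta T = \frac{L}{v_g} - \frac{L}{c} = \frac{L}{c}\left(\frac{1}{\mu} - 1\right) \simeq \frac{L}{c}\,\frac{\nu_p^2}{2\nu^2}.$$

The magnitude of the propagation delay can hence be written as

$$|\Delta T| = \frac{L}{c} \times \frac{4 \times 10^{7}}{\nu_{\rm Hz}^2}\, n_e,$$

where $\nu_{\rm Hz}$ is the frequency in Hz and $n_e$ is in cm$^{-3}$; the numerical factor follows
from $\nu_p \simeq 9\sqrt{n_e}$ kHz (eqn 16.1.2).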


The propagation delay can also be considered as an “excess path length” $\Delta L = c\,\Delta T$.
Further since $(v_g/c - 1)$ and $(v_p/c - 1)$ differ only in sign (see footnote 1), the magnitude
of the “excess phase” (viz. $2\pi\nu(L/v_p - L/c)$) is given by $\Delta\Phi = 2\pi\nu\,\Delta T$. Note
that since the propagation delay is a function of frequency $\nu$, waves of different frequencies
get delayed by different amounts. A pulse of radiation incident at the far end of the slab will
hence get smeared out on propagation through the slab; this is called “dispersion”. If the
plasma also has a magnetic field running through it then it becomes birefringent - the
refractive index is different for right and left circularly polarized waves. A linearly
polarized wave can be considered a superposition of left and right circularly polarized waves.
On propagation through a magnetized plasma the right and left circularly polarized components
are phase shifted by different amounts, or equivalently the plane of polarization of the
linearly polarized component is rotated. This rotation of the plane of polarization on passage
through a magnetized plasma is called “Faraday rotation”. The angle through which the
plane of polarization is rotated is given by

$$\theta = {\rm RM}\,\lambda^2 = 0.81\,\lambda^2 \int n_e B_\parallel\, dl,$$

where RM is called the rotation measure. For the second equality $\lambda$ is in meters, $n_e$ is in
cm$^{-3}$, $B_\parallel$ is in $\mu$G and the length is in parsecs.
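
The delay and Faraday rotation formulae of this section are easy to evaluate numerically. The sketch below is a hedged illustration only: it assumes the approximate plasma frequency of eqn 16.1.2 and a uniform slab, and the function names and the example numbers (a column density of $5 \times 10^{13}$ cm$^{-2}$ at 100 MHz, and a 1 kpc path with $n_e = 0.03$ cm$^{-3}$ and a 2 $\mu$G field) are assumptions chosen for illustration.

```python
C_CM_PER_S = 2.998e10           # speed of light in cm/s
NU_P_SQ_PER_DENSITY = 9.0e3**2  # (plasma frequency in Hz)^2 per unit density in cm^-3


def plasma_delay_s(column_density_cm2, nu_hz):
    """Propagation delay (L/c) * nu_p^2 / (2 nu^2), written in terms of n_e * L."""
    return column_density_cm2 * NU_P_SQ_PER_DENSITY / (2.0 * nu_hz**2 * C_CM_PER_S)


def rotation_measure(n_e_cm3, b_par_microgauss, length_pc):
    """Rotation measure in rad m^-2 for a uniform slab."""
    return 0.81 * n_e_cm3 * b_par_microgauss * length_pc


def faraday_rotation_rad(wavelength_m, n_e_cm3, b_par_microgauss, length_pc):
    """Rotation angle of the plane of polarization, theta = RM * lambda^2."""
    return rotation_measure(n_e_cm3, b_par_microgauss, length_pc) * wavelength_m**2


print(f"delay at 100 MHz: {plasma_delay_s(5e13, 100e6) * 1e6:.2f} microsec")
print(f"delay at  50 MHz: {plasma_delay_s(5e13, 50e6) * 1e6:.2f} microsec")
print(f"RM = {rotation_measure(0.03, 2.0, 1000.0):.1f} rad/m^2")
print(f"rotation at 1 m wavelength: {faraday_rotation_rad(1.0, 0.03, 2.0, 1000.0):.1f} rad")
```

Halving the frequency quadruples the delay, which is the origin of the dispersion of a broad band pulse.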
Figure 16.1: Propagation through a plane parallel ionosphere

16.3 Propagation Through a Smooth Ionosphere 

For an interferometer, there are two quantities of interest: (i) the delay difference between
the signals reaching the two arms of the interferometer ($\delta T = \Delta T_1 - \Delta T_2$), where $\Delta T_1$
and $\Delta T_2$ are the propagation delays for the two arms of the interferometer, and (ii) the
phase difference between the signals reaching the two arms of the interferometer
($\delta\phi = (2\pi/\lambda)(\Delta L_1 - \Delta L_2)$), where $\Delta L_1$ and $\Delta L_2$ are the excess path lengths for
the two arms of the interferometer. Generally $\delta T$ is small compared to the coherence time of
the signal (the inverse of its bandwidth) and can be ignored to first order; however, $\delta\phi$
could be substantial.

In a homogeneous plane parallel ionosphere with refractive index $\mu$ (see Figure 16.1),
we have from Snell’s law $\mu \sin(z_0) = \sin(z)$. The observed geometric delay is
$\tau_g = \mu D \sin(z_0)/c$, since the phase velocity is $c/\mu$. From Snell’s law therefore,
$\tau_g = D \sin(z)/c$, the same as would have been observed in the absence of the ionosphere.
A homogeneous plane parallel ionosphere hence produces no net effect on the visibilities, even
though the apparent position of the source has changed. In the case where the interferometer
is located outside the slab, there is neither a change in the apparent position nor a change
in the phase, as is obvious from the geometry. This entire analysis holds for a stratified plane
parallel ionosphere (since it is true for every individual plane parallel layer). However,
in the real case of a curved ionosphere, with a radial variation of electron density,
neither the change in the apparent position nor $\delta\phi$ is zero even outside the ionosphere.
Effectively, the direction of arrival of the rays from the distant source appears to be
different from the true direction of arrival (as illustrated in Figure 16.2) and unlike in the
plane parallel case this is not exactly canceled out by the change in the refractive index.
If $\Delta\theta$ is the difference between the true and apparent directions of arrival, then
one can compute that


$$\Delta\theta = \frac{A \sin(z_0)}{r_0} \int \frac{\alpha^2\, \mu(h)\, dh}{(1 - \alpha^2 \sin^2(z_0))} \tag{16.3.5}$$



where $z_0$ is the observed zenith angle, $r_0$ is the radius of the earth, $h$ is the height above
the earth’s surface, $\mu(h)$ is the refractive index at height $h$, and $A$ is a constant. For
baseline lengths typical of the GMRT, this value is the same for both arms of the baseline.
If the baseline has UV co-ordinates $(u,v)$, then the phase difference due to the apparent
change in the source position is given by


$$\Delta\phi = 2\pi(u\,\Delta\theta_{EW} + v\,\Delta\theta_{NS}).$$
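
To get a feel for the size of this phase, the relation can be evaluated directly. The sketch below assumes that $u$ and $v$ are measured in wavelengths and the position shifts in radians; the baseline and shift values are arbitrary illustrative numbers, not measurements.

```python
import math


def refraction_phase_rad(u_wavelengths, v_wavelengths, dtheta_ew_rad, dtheta_ns_rad):
    """Interferometer phase produced by an apparent shift in the source position."""
    return 2.0 * math.pi * (u_wavelengths * dtheta_ew_rad + v_wavelengths * dtheta_ns_rad)


# Illustrative case: a 1000 wavelength east-west baseline and a 1 arcmin east-west shift.
one_arcmin = math.radians(1.0 / 60.0)
print(f"phase = {refraction_phase_rad(1000.0, 0.0, one_arcmin, 0.0):.3f} rad")
```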

1 To first order for $\nu \gg \nu_p$, as can be easily verified.
Figure 16.2: Propagation through a curved ionosphere


                     Max. Val (Day)                 Min. Val (Night)               Freq. Dependence

TEC                  $5 \times 10^{13}$ cm$^{-2}$   $5 \times 10^{12}$ cm$^{-2}$   -
Group Delay          12 $\mu$sec                    1.2 $\mu$sec                   $\nu^{-2}$
Excess Path          3500 m                         350 m                          $\nu^{-2}$
Phase Change         7500 rad                       750 rad                        $\nu^{-2}$
Phase Fluctuation    ±150 rad                       ±15 rad                        $\nu^{-2}$
Mean Refraction      6'                             0.6'                           $\nu^{-2}$
Faraday Rotation     15 cycles                      1.5 cycles                     $\nu^{-2}$


Table 16.1: Typical numerical values of various ionospheric effects


Typical values for some of the ionospheric propagation effects that we have been
discussing are given in Table 16.1.


16.4 Propagation Through an Inhomogeneous 
Ionosphere 

So far we have been dealing with an ionosphere which, while not homogeneous, is still
fairly simple in that the density fluctuations are smooth, slowly varying functions.
Further, the ionospheric density was assumed to not vary with time. In reality, the earth’s
ionosphere shows density fluctuations on a large range of length and time scales. A density
fluctuation of length scale $l$ at a height $h$ above the earth’s surface corresponds to a
fluctuation on an angular scale of $l/h$. For a typical length scale $l$ of 10 km, at a height of
200 km, the corresponding angular scale is ~ 3°. This means that the phase difference
introduced by the ionosphere changes on an angular scale of 3°. If this phase is to be
calibrated out, then one would need to pick a calibrator that is within 3° of the target
source; for most sources it turns out that there is no suitable calibrator this close by.
This problem gets increasingly worse as one goes to lower frequencies since the excess
ionospheric phase increases as $\nu^{-2}$. As discussed in Chapter 5, as long as the
excess ionospheric phase is constant over the field of view, this phase can instead be lumped
in with the electronic phase of the receiver chain, and can be solved for using self-calibration.

Figure 16.3: For short enough baselines, the isoplanatic assumption holds even if the field
of view is larger than the typical coherence length of the ionospheric irregularities. This
is because both arms of the interferometer get essentially the same excess phase.

However, for a given antenna, as one observes at lower and lower frequencies, the field
of view increases as $\nu^{-1}$. Since the excess ionospheric phase is also increasing rapidly
with decreasing frequency, one will soon hit a point where the assumption that the excess
phase is constant over the field of view is a poor one. At this point the self-calibration
algorithm is no longer applicable. Variations of the ionospheric phase over the field of view
are referred to as “non isoplanaticity”. As illustrated in Figure 16.3, when the baseline
length is small compared to the typical length scale of ionospheric density fluctuations,
even though the ionospheric phase is different for different sources in the field of view, the
excess phase is nearly identical at both ends of the baseline. Since interferometers are
sensitive only to phase differences between the two antennas, the isoplanatic assumption
still holds. The non isoplanaticity problem hence arises only when the baselines as well
as the field of view are sufficiently large. For the GMRT, isoplanaticity is often a poor
assumption at frequencies of 325 MHz and lower.
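
The competition between the angular scale of the irregularities and the growing field of view can be made concrete with a short calculation. The sketch below assumes a field of view of roughly $\lambda/D$ for a dish of diameter $D$, and takes $D = 45$ m as an assumed GMRT-like dish size; the 10 km and 200 km figures are the ones used in the text, while the list of frequencies is illustrative.

```python
import math

C_M_PER_S = 2.998e8  # speed of light in m/s


def isoplanatic_patch_deg(scale_km, height_km):
    """Angular scale l/h of an irregularity of size l at height h, in degrees."""
    return math.degrees(scale_km / height_km)


def field_of_view_deg(freq_hz, dish_diameter_m):
    """Rough field of view lambda/D of a single dish, in degrees."""
    wavelength_m = C_M_PER_S / freq_hz
    return math.degrees(wavelength_m / dish_diameter_m)


patch = isoplanatic_patch_deg(10.0, 200.0)
print(f"angular scale of a 10 km irregularity at 200 km: {patch:.2f} deg")
for freq_mhz in (1420, 610, 325, 150):
    fov = field_of_view_deg(freq_mhz * 1e6, 45.0)
    print(f"{freq_mhz:5d} MHz: field of view ~ {fov:.2f} deg ({fov / patch:.2f} of the patch)")
```

The ratio printed in the last column grows as the frequency falls, which is the trend described above; where exactly the isoplanatic assumption fails also depends on the strength of the phase fluctuations and on the baseline length, neither of which this sketch models.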


16.5 Angular Broadening 



Figure 16.4: Angular broadening. 

As discussed in the previous sections, the small scale fluctuations of electron density in
the ionosphere lead to an excess phase for a radio wave passing through it. This excess
phase is given by

$$\phi(x) = \frac{2\pi}{\lambda} \int \Delta\mu\, dz,$$

$$\phi(x) = C\lambda \int \Delta n(x,z)\, dz,$$

where $\Delta\mu$ is the change in refractive index due to the electron density fluctuation, $C$ is
a constant, $\Delta n(x,z)$ is the fluctuation in electron density at the point $(x,z)$, and the
integral is over the entire path traversed by the ray (see Figure 16.4).

If we assume that $\phi(x)$ is a zero mean Gaussian random process, with auto-correlation
function given by $\phi_0^2\,\rho(r)$, where $\rho(r) = e^{-r^2/2a_0^2}$, then from the relation above for
$\phi(x)$ we can determine that $\phi_0^2 \propto \lambda^2 \Delta n^2 L$, where $L$ is the total path length
through the ionosphere (see footnote 2). Let us assume that a plane wavefront from an extremely
distant point source is incident on the top of such an ionosphere. In the absence of the
ionosphere the wave reaching the surface of the earth would also be a plane wave. For a plane
wave the correlation function of the electric field (i.e. the visibility) is given by
$\langle E_i(x) E_i^*(x + r) \rangle = E_i^2$, i.e. a constant independent of $r$. On passage through the
ionosphere however, different parts of the wave front acquire different phases, and hence the
emergent wavefront is not plane. If $E(x)$ is the electric field at some point on the emergent
wave, then we have $E(x) = E_i e^{-i\phi(x)}$. Since $E_i$ is just a constant, the correlation function
of the emergent electric field is

$$\langle E(x) E^*(x + r) \rangle = E_i^2 \langle e^{-i(\phi(x) - \phi(x + r))} \rangle.$$

From our assumptions about the statistics of $\phi(x)$ this can be evaluated to give

$$\langle E(x) E^*(x + r) \rangle = E_i^2 e^{-\phi_0^2 (1 - \rho(r))}. \tag{16.5.6}$$

If $\phi_0^2$ is very large, then the exponential falls rapidly to zero as $(1 - \rho(r))$ increases (or
equivalently when $r$ increases). It is therefore adequate to evaluate it for small values of
$r$, for which $\rho(r)$ can be Taylor expanded to give $\rho(r) \simeq 1 - r^2/2a_0^2$, and we get

$$\langle E(x) E^*(x + r) \rangle = E_i^2 e^{-\phi_0^2 r^2 / 2a_0^2}.$$

The emergent electric field hence has a finite coherence length (while the coherence length
of the incident plane wave was infinite). From the van Cittert-Zernike theorem this is
equivalent to saying that the original unresolved point source has got blurred out to a
source of finite size. This blurring out of point sources is called “angular broadening”
or “scatter broadening”. If we define $a = a_0/\phi_0$ then the visibilities have a Gaussian
distribution given by $e^{-r^2/2a^2}$, meaning that the characteristic angular size $\theta_{scat}$ of the
scatter broadened source is $\sim \lambda/a \propto \lambda^2 \sqrt{\Delta n^2 L}$. $\theta_{scat}$ is called the “scattering angle”.

On the other hand if $\phi_0^2$ is small then the exponential in eqn 16.5.6 can be Taylor
expanded to give

$$\langle E(x) E^*(x + r) \rangle = E_i^2 \left[ 1 - \phi_0^2 (1 - \rho(r)) \right]
= E_i^2 \left[ (1 - \phi_0^2) + \phi_0^2 e^{-r^2/2a_0^2} \right].$$

This corresponds to the visibilities from an unresolved core (of flux density $E_i^2 (1 - \phi_0^2)$)
surrounded by a weak halo.
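
The step from the Gaussian statistics of the phase to the exponential form of the field correlation can be checked by simulation. The following is a minimal Monte Carlo sketch, not a model of any real ionosphere: it draws pairs of zero mean Gaussian phases with variance $\phi_0^2$ and correlation coefficient $\rho$, averages $e^{-i(\phi_1 - \phi_2)}$, and compares the result with $e^{-\phi_0^2(1-\rho)}$. The sample size, the seed and the chosen values of $\phi_0$ and $\rho$ are arbitrary assumptions.

```python
import cmath
import math
import random


def simulated_coherence(phi0, rho, n_samples=100_000, seed=1):
    """Monte Carlo estimate of <exp(-i(phi_a - phi_b))> for correlated Gaussian phases."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_samples):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        phi_a = phi0 * g1
        # phi_b has the same variance as phi_a and correlation coefficient rho with it.
        phi_b = phi0 * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
        total += cmath.exp(-1j * (phi_a - phi_b))
    return total / n_samples


def predicted_coherence(phi0, rho):
    """Analytic value exp(-phi0^2 (1 - rho)) for unit incident amplitude."""
    return math.exp(-phi0 ** 2 * (1.0 - rho))


for phi0, rho in [(0.3, 0.5), (1.0, 0.5), (2.0, 0.9)]:
    estimate = simulated_coherence(phi0, rho)
    print(f"phi0={phi0:.1f} rho={rho:.1f}  simulated={estimate.real:.3f}  "
          f"predicted={predicted_coherence(phi0, rho):.3f}")
```

The imaginary part of the estimate averages to zero, as expected for a phase difference that is symmetric about zero.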

2 This follows from the equation for $\phi(x)$ if you also assume that $\langle \Delta n(x,z) \Delta n(x,z') \rangle = \Delta n^2\, \delta(z - z')$.
16.6 Scintillation


Figure 16.5: Scintillation due to the ionosphere

In the last section we dealt with an ionosphere which had random density fluctuations
in it. In that model the density was assumed to vary randomly with position,
but not with time. In the earth’s ionosphere however, the density does vary with both
position and time. Temporal variations arise both because of intrinsic variation and
because of traveling disturbances in the ionosphere, which can carry a given pattern
of density fluctuations across the line of sight.

This temporal variation of the density fluctuations means that the coherence function
(even at some fixed separation on the surface of the earth) will vary with time. This
phenomenon is generically referred to as “scintillation”. Depending on the typical scattering
angle as well as the typical height of the scattering layer from the surface of the earth,
the scintillation could be either “weak” or “strong”.

As discussed in the previous section, rays on passing through an irregular ionosphere
get scattered by a typical angle $\theta_{scat}$. If the scattering occurs at a height $h$ above the
antennas, then as shown in Figure 16.5 these scattered rays have to traverse a further
distance $h$ before being detected. The transverse distance traveled by a scattered ray is
$\sim h\theta_{scat}$. If this length is much less than the coherence length $a$, then the rays scattered
by different irregularities in the scattering medium do not intersect before reaching the
ground. The corresponding condition is that $h\theta_{scat} < a$, i.e. $h\theta_{scat} < \lambda/\theta_{scat}$ or
$h\theta_{scat}^2 < \lambda$.

If this condition holds, then, at any instant of time (as discussed in the previous
section), what the observer sees is an undistorted image of the source, which is shifted
in position due to refraction. As time passes, the density fluctuations change (see footnote 3)
and so the image appears to wander in the sky, and in a long exposure image which averages
many such wanderings the source appears to have a scatter broadened size $\theta_{scat}$.
Provided that one can do self calibration on a time scale that is small compared to the
time scale of the “image wander”, this effect can be corrected for completely. On the
other hand, when $h\theta_{scat}^2 > \lambda$ the rays from different density fluctuations will intersect
and interfere with one another. The observer sees more than one image, and because of
the interference, the amplitude of the received signal fluctuates with time. This is called
“amplitude” scintillation. Amplitude scintillation at low frequencies, particularly over the
Indian subcontinent, can be quite strong. The source flux could change by factors of 2
or more on very short timescales. This effect cannot be reliably modeled and removed
from the data, and hence observations are effectively precluded during periods of strong
amplitude scintillation.

3 But we assume that their statistics remain exactly the same, i.e. that they continue to be
realizations of a Gaussian random process with variance $\phi_0^2$ and auto-correlation $\rho(r)$.
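
The criterion separating the two regimes described in this section is simple enough to code up. The sketch below is only an illustration of the inequality involving $h\theta_{scat}^2$ and $\lambda$; the regime labels, the 300 km screen height, the 2 m wavelength and the scattering angles are assumptions made for the example.

```python
import math


def scintillation_regime(height_m, theta_scat_rad, wavelength_m):
    """Classify the regime by comparing h * theta_scat^2 with the wavelength."""
    if height_m * theta_scat_rad ** 2 < wavelength_m:
        return "image wander"
    return "amplitude scintillation"


# Illustrative numbers: screen at 300 km, 2 m wavelength, two scattering angles.
for arcmin in (1.0, 10.0):
    theta = math.radians(arcmin / 60.0)
    print(f"theta_scat = {arcmin:4.1f} arcmin -> {scintillation_regime(3.0e5, theta, 2.0)}")
```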


16.7 Further Reading 

1. Interferometry and Synthesis in Radio Astronomy; Thompson, A. Richard, Moran, 
James M., Swenson Jr., George W.; Wiley-Interscience Publication, 1986. 

2. Synthesis Imaging In Radio Astronomy; Eds. Perley, Richard A., Schwab, Frederic 
R., and Bridle, Alan H.; ASP Conference Series, Vol 6. 



Chapter 17 

Pulsar Observations 

Yashwant Gupta 


17.1 Introduction 

Amongst the various kinds of sources observed in Radio Astronomy, pulsars are perhaps
the most distinctive kind, from many points of view. A pulsar is a neutron star - the
ultra-dense core that remains after a massive star undergoes a supernova explosion -
spinning at very rapid rates ranging from once in a few seconds to as much as ~ 1000
times per second. A pulsar has a magnetosphere with a very high value of the magnetic
field ($\sim 10^{6} - 10^{9}$ Gauss). The emission mechanism (which is not understood yet)
produces radio frequency radiation that comes out in two beams, one from each pole
of the magnetosphere. These rotating beams of radiation are seen by us whenever they
intersect our line of sight to the pulsar, much like a lighthouse on the sea-shore. Each
rotation of the pulsar thus produces a narrow pulse of radiation that can be picked up
by a radio telescope. Several properties of pulsars - such as their ultra-compact size, the
occurrence of narrow duty cycle pulses with highly stable periods, intensity fluctuations
on very short time scales and high degree of polarisation of the radiation - make for a
set of observation and data analysis techniques that are very different from those used in
radio interferometry. Here we take a look at these special techniques in some detail.


17.2 Requirements for Pulsar Observations 

17.2.1 Phased Array Requirements 

As with all radio sources, the sensitivity of pulsar observations benefits from the availability
of a large collecting area. However, because of the compact nature of the source of radiation
(typically a few hundred kilometers across), a pulsar is effectively a point source for
the largest interferometer baselines on the Earth. Hence, there is not much to be learnt
from making a map of a pulsar! This means that single dish observations are enough for
pulsar work. However, since pulsars are relatively weak sources (typical average flux
densities < 100 mJy), large collecting areas are very useful and hence array telescopes are
used for this advantage. These array telescopes are not used in the interferometer mode,
but in the phased array mode (see chapter 6). This means that much of the complicated
hardware of the correlator required for measuring the visibilities on all baselines is not
needed. In phased array mode, pulsar observations can be carried out in two different
ways: (i) incoherent phased array observations and (ii) coherent phased array observations.
In the incoherent phased array mode, the signal from each antenna is put through
a detector and the outputs from these are added to obtain the net signal. In coherent phased
array mode, the voltage signals from the antennas are added and the summed output is
put through a detector to obtain the final power signal. For an array of $N$ antennas, the
incoherent phased array gives a sensitivity of $\sqrt{N}$ times that of a single antenna, while
the coherent array gives a sensitivity of $N$ times that of a single antenna. The incoherent
array has an effective beam that is the same as that of a single antenna of the array, whereas
the coherent array has a beam width that is much narrower than that of a single antenna,
being $\sim \lambda/D$, where $D$ is the largest spacing between antennas in the array. The coherent
phased array mode is ideally suited for observations of known pulsars. The incoherent
phased array mode is most useful for large scale pulsar search observations, where the
aim is to cover a maximum area of the sky in a given time, at a given level of sensitivity.
For a sparsely filled aperture array, incoherent phased array observations will certainly be
faster for such applications.
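
The trade-off between the two phased array modes can be illustrated with a few numbers. The sketch below is a rough, hedged calculation: it uses the $\sqrt{N}$ and $N$ sensitivity factors and the $\lambda/D$ beam width from the text, and estimates how many coherent beams are needed to tile one single-antenna beam as the square of the ratio of array extent to dish diameter. The values of 30 antennas, 45 m dishes and a 25 km array extent are illustrative assumptions for a GMRT-like array.

```python
import math


def sensitivity_gain(n_antennas, coherent):
    """Sensitivity relative to a single antenna: N if coherent, sqrt(N) if incoherent."""
    return float(n_antennas) if coherent else math.sqrt(n_antennas)


def beam_width_rad(wavelength_m, aperture_m):
    """Approximate beam width lambda/D for an aperture (or array extent) D."""
    return wavelength_m / aperture_m


def pointing_ratio(array_extent_m, dish_diameter_m):
    """Rough number of coherent array beams needed to tile one single-dish beam."""
    return (array_extent_m / dish_diameter_m) ** 2


n_ant, dish_m, extent_m, wavelength_m = 30, 45.0, 25.0e3, 0.92
print(f"incoherent gain: {sensitivity_gain(n_ant, coherent=False):.2f}")
print(f"coherent gain:   {sensitivity_gain(n_ant, coherent=True):.2f}")
print(f"single dish beam:    {math.degrees(beam_width_rad(wavelength_m, dish_m)):.3f} deg")
print(f"coherent array beam: {math.degrees(beam_width_rad(wavelength_m, extent_m)) * 3600:.1f} arcsec")
print(f"coherent beams per single dish beam: {pointing_ratio(extent_m, dish_m):.0f}")
```

For a sparse array the number of coherent beams per single-dish beam is far larger than the gain in sensitivity, which is why the incoherent mode is attractive for searches over large areas of sky.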

17.2.2 Spectral Resolution Requirements 

Again as with all radio sources, pulsar observations also benefit from large bandwidths of
observation. However, unlike for any other kind of continuum radio source, pulsar observations
often cannot combine the data from across a large bandwidth in a single detector.
This is mainly because of the smearing of the pulses produced by differential dispersion
delay of frequencies across the band, due to propagation of the pulsar signal through the
interstellar medium. This is explained in some detail in section 17.4 below. In the simplest
technique for reducing the effect of dispersion delay smearing, the pulsar signal is processed
in a multichannel receiver where the observing band is broken up into narrower
frequency channels. The signal in each channel is detected and acquired separately. This
requirement of narrower frequency channels across the observing band makes a pulsar
receiver similar to a spectral line receiver, though for entirely different reasons.

17.2.3 Requirements for Time Resolution and Accurate Time Keeping

Unlike other radio sources which are taken to be statistically constant in their strength
as a function of time, pulsar signals are intrinsically periodic signals. The pulses have
periods ranging from a few seconds for the slowest pulsars to about a millisecond for
the fastest pulsars known. Further, the pulses have a very small duty cycle, with typical
pulse widths of the order of 5 - 10% of the period. Thus typical pulse widths range from a
few tens of milliseconds down to a fraction of a millisecond. Study of such pulsar signals
clearly requires the final data to have time resolutions ranging from ~ milliseconds to ~
microseconds. Pulsar observations thus require very fast sampling times for the data.
This leads to a substantial increase in the speed (and therefore complexity) of the back-end
designed for pulsar observations and also in the speed of the data acquisition system
and off-line computing capabilities. Also, the value of the sampling interval needs to be
known quite accurately in order to preserve the pulse phase coherence over a long stretch
of pulsar data spanning many periods.

The other property of the time variation of pulsar signals is that the rotation rate of
pulsars is very accurate. This means that if the time of arrival of the Nth and (N+1)th
pulses is known, the arrival time for the (N+M)th pulse can be predicted very accurately.
Further, slow variations of the pulsar period (for example due to rotational slow down
of the pulsar) can be studied if the absolute time of arrival of the pulses can be measured
sufficiently accurately. This requires the availability of a very precise clock at the
observatory, such as that provided by a GPS receiver (see section 17.7 for more details).
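
The two timing requirements of this section can be illustrated with a toy calculation. The sketch below assumes a strictly constant period (no slow down) and a constant fractional error in the assumed sampling interval; the function names and the numbers are illustrative assumptions only.

```python
def predict_arrival(t_n, t_n_plus_1, m):
    """Arrival time of pulse N+M, assuming a constant period between pulses N and N+1."""
    period = t_n_plus_1 - t_n
    return t_n + m * period


def accumulated_drift_s(fractional_interval_error, span_s):
    """Timing drift built up over span_s seconds if the sampling interval is mis-known."""
    return fractional_interval_error * span_s


# Pulses N and N+1 arrive at 10.0 s and 10.5 s; predict the arrival of pulse N+4.
print(f"predicted arrival of pulse N+4: {predict_arrival(10.0, 10.5, 4):.3f} s")

# A sampling interval known to one part in a million, over a one hour stretch of data.
drift = accumulated_drift_s(1e-6, 3600.0)
print(f"accumulated drift over one hour: {drift * 1e3:.2f} millisec")
```

Comparing the accumulated drift with the pulse width (a few percent of the period) gives a rough idea of how accurately the sampling interval must be known for a given length of data.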

17.2.4 Requirements for Polarimetry 

Radiation from pulsars has been shown to be highly polarised. The linear polarisation
can at times reach close to 100%. Significant amounts of circular polarisation are also seen
frequently. The study of these polarisation characteristics is very important for
understanding the emission mechanism of pulsars. Hence pulsar studies often require that
the telescope support full polarisation observations that finally yield the four Stokes
parameters, as a function of time and frequency. Remember that each of these polarisation
parameters needs to satisfy all the time and frequency resolution criteria outlined above,
leading to a four fold increase in hardware complexity and data flow rate over simple total
power observations.

17.2.5 Flux Calibration Requirements 

The intensity of individual pulses varies randomly over various time scales. On the shortest
time scale, pulse to pulse intensity fluctuations are thought to be due to intrinsic
processes in the pulsar magnetosphere. Longer time scale fluctuations in the mean pulsar
flux are produced by propagation processes in the ionised plasma of the interstellar
medium (ISM). Furthermore, some of these intensity fluctuations can be uncorrelated
over large frequency intervals. Thus for purposes of estimating the pulsar flux (including
estimates of the spectral index) and for studying the variations in the pulsar flux to
understand properties of the ISM, pulsar observations need to be calibrated with known
sources of power. This can be done by using either calibrated noise sources that can be
switched into the signal path or known calibration sources in the sky.


17.3 Basic Block Diagram of a Pulsar Receiver 

Incorporating the above requirements into a realistic set-up for pulsar observations leads
to the following block level diagram for pulsar observations (see Fig 17.1). In a modern
radio telescope, most of the processing of the signals is carried out in the digital domain,
after down conversion to a baseband signal (of bandwidth $B$). Hence the first block is an
analog to digital converter (ADC), which is run on an accurate and controlled sampling
clock. For multi-element or array telescopes, the signals from the different elements
need to be phased. This involves proper adjustments of amplitude, delay and phase of
the signals (see chapter 6). The output of this block is the phased array signal which
goes to the ‘Spectral Resolution Block’. For a single dish telescope, the signal comes
directly from the sampler to this block. This block produces the multiple narrow-band
channels from the single broad-band data. This can be achieved using a filter bank
or a FFT spectrometer or an auto-correlation spectrometer. The output is a baseband
voltage signal for each of $N_{ch}$ frequency channels, sampled at the Nyquist rate. For a
multi-element telescope, the location of this block and the Phased Array block can be
interchanged, in part or in whole. For example, at the GMRT, the integer sample delay
correction is done before the FFT; the fractional sample delay correction and the phase
correction are done in the last stage of the FFT and the addition of the signals is done in
a separate block located after the FFT. Note that for incoherent phased array operation
to be possible, the addition of the signals MUST be after the spectral resolution block,
because the square law detection has to be carried out before the incoherent addition can
be done.

Figure 17.1: Block diagram of a typical pulsar receiver

The second orthogonal polarisation from the telescope is also processed similarly till
the output from the spectral resolution block. These outputs can then be given to two
different kinds of processors. The first is a total power adder that simply adds the powers
of the signals in the two polarisations to give a measure of the total intensity from the
telescope as a function of time and frequency. The second is a polarimeter that takes the
voltage signals from the two polarisations and produces the four Stokes parameters, as a
function of time and frequency. The data from the incoherent phased array, for example,
can only be put through the total power path. The outputs from these two processors
are then put through an adder that integrates the data to the required time constant,
$\tau_s$. The final output going to the recorder then is either one (total intensity mode) or four
(polarimetry mode) signals each containing $N_{ch}$ frequency channels coming at the rate of
$1/\tau_s$ samples per second. The net data rate into the recorder is then $N_{ch}/\tau_s$ samples per
second for the total intensity mode and four times as much for the polarimetry mode. As
an example, if data from 256 spectral channels are being acquired with a time constant of
0.25 millisec, the data rate is about 1 mega samples per second for the total intensity mode.
If one sample is stored as a two byte word, we can see that a storage space of 1 gigabyte
would get filled with about 8 minutes of data (and four times faster in the polarimetry
mode)! In cases where the data rate going into
the recorder in the above set-up is difficult to handle for storage or off-line processing,
special purpose hardware to do some of the processing on-line can also be used. Typical
examples of such processing would be on-line dedispersion, on-line folding at the pulsar
period and on-line gating of the data (to pass on only some region of each pulsar period
that is around the on-pulse region). Each of these techniques reduces the net data rate
so that it can be comfortably acquired and further processed off-line. The choice of the
processing technique depends on the scientific goals of the observations.
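
The data rate arithmetic of this section is easily scripted when planning an observation. The sketch below assumes that the rate is simply the number of signals times the number of channels divided by the sampling time, and that a gigabyte is $10^9$ bytes; the function names are illustrative.

```python
def data_rate(n_channels, t_sample_s, n_signals=1, bytes_per_sample=2):
    """Return (samples per second, bytes per second) going to the recorder."""
    samples_per_s = n_signals * n_channels / t_sample_s
    return samples_per_s, samples_per_s * bytes_per_sample


def fill_time_minutes(storage_bytes, bytes_per_s):
    """Time in minutes needed to fill the given storage at the given byte rate."""
    return storage_bytes / bytes_per_s / 60.0


for label, n_signals in (("total intensity", 1), ("polarimetry", 4)):
    samples, byte_rate = data_rate(256, 0.25e-3, n_signals=n_signals)
    minutes = fill_time_minutes(1.0e9, byte_rate)
    print(f"{label:16s} {samples / 1e6:.3f} Msamples/s  {byte_rate / 1e6:.3f} MB/s  "
          f"1 GB fills in {minutes:.1f} min")
```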


17.4 Dispersion and Techniques for its Correction 


As mentioned earlier, propagation of pulsar signals through the tenuous plasma of the 
ISM produces dispersion of the pulses. This is because the speed of propagation through 
a plasma varies with the frequency of the wave (see chapter 16). Low frequency waves 
travel progressively slowly, with a cut-off in propagation at the plasma frequency. At high 
frequencies, the velocity reaches the velocity of light asymptotically. The difference in 
travel time between two radio frequencies fi and / 2 is given by 


t d = K DM (Js ~ , (17.4.1) 

where $DM = \int n_e \, dl$ is the dispersion measure of the pulsar, usually measured in the 
somewhat unusual units of pc cm$^{-3}$, and $K = 4.149 \times 10^6$ is a constant. In this equation, 
$t_d$ is in units of millisec and $f_1$, $f_2$ are in units of MHz. For the typical ISM, a path length 
of 1 kiloparsec amounts to a DM of about 30 pc cm$^{-3}$. Equation (17.4.1) can be used to derive 
the following approximate relationship for the dispersion smear time for a bandwidth $B$ 
centred at a frequency of observation $f_0$, for the case when $B \ll f_0$ 


$$ \tau_{disp} = 2 K \, \frac{DM \, B}{f_0^3} \approx \left( \frac{202}{f_0} \right)^3 DM \, B , \qquad (17.4.2) $$


where $\tau_{disp}$ is in millisec, $f_0$ and $B$ are in MHz and DM is in the units given earlier. 
Interstellar dispersion degrades the effective time resolution of pulsar data due to smearing, 
and this effect becomes worse with decreasing frequency of observation. For example, 
the dispersion smear time is about 0.25 millisec per MHz of bandwidth per unit DM at an 
observing frequency of 325 MHz. This means that a pulse of 25 millisec width would be 
broadened to twice its true width when observed with a bandwidth of 10 MHz, for a DM 
of 10 pc cm$^{-3}$. Even worse, the signal from a pulsar of period 25 millisec would be completely 
smeared out and not be visible as individual pulses. Thus it is important to reduce the 
effect of interstellar dispersion in pulsar data. This is called dedispersion. 
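
The delay and smearing relations can be checked numerically. The sketch below evaluates equation (17.4.1) directly and compares the exact smearing across a band with the narrow-band approximation $2 K \, DM \, B / f_0^3$, which follows from differentiating (17.4.1). The function names are illustrative only.

```python
K_MS = 4.149e6  # constant of equation (17.4.1): delay in millisec, frequencies in MHz


def dispersion_delay_ms(dm, f1_mhz, f2_mhz):
    """Delay of frequency f1 relative to f2; positive when f1 < f2."""
    return K_MS * dm * (1.0 / f1_mhz ** 2 - 1.0 / f2_mhz ** 2)


def smear_time_ms(dm, f0_mhz, bandwidth_mhz):
    """Exact delay between the two edges of a band of width B centred on f0."""
    half = 0.5 * bandwidth_mhz
    return dispersion_delay_ms(dm, f0_mhz - half, f0_mhz + half)


def smear_time_approx_ms(dm, f0_mhz, bandwidth_mhz):
    """Narrow-band approximation, valid when B is much smaller than f0."""
    return 2.0 * K_MS * dm * bandwidth_mhz / f0_mhz ** 3


per_mhz_per_dm = smear_time_approx_ms(1.0, 325.0, 1.0)   # smear per MHz per unit DM
exact_10 = smear_time_ms(10.0, 325.0, 10.0)              # DM = 10, B = 10 MHz
approx_10 = smear_time_approx_ms(10.0, 325.0, 10.0)

print("smear per MHz per unit DM at 325 MHz: %.3f millisec" % per_mhz_per_dm)
print("DM = 10, B = 10 MHz: exact %.2f millisec, approximate %.2f millisec" % (exact_10, approx_10))
```

For a 10 MHz band at 325 MHz the two estimates agree to well within a percent, so the approximation is adequate for planning purposes.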

There are two main methods used for dedispersion - incoherent dedispersion and 
coherent dedispersion. In incoherent dedispersion, which is a post-detection technique, the 
total observing band (of bandwidth $B$) is split into $N_{ch}$ channels and the pulsar signal is 
acquired and detected in each of these. The dispersion smearing in each channel is less 
than the total smearing across the whole band, by a factor of $N_{ch}$. The detected signal 
from each channel is delayed by the appropriate amount so that the dispersion delay 
between the centers of the channels is compensated. These differentially delayed data trains 
from the $N_{ch}$ channels are added to obtain a final signal that has the dispersion smearing 
time commensurate with a bandwidth of $B/N_{ch}$, thereby reducing the effect of dispersion. 
In practical realisations of this scheme, the splitting of the band into narrow channels 
is usually carried out on-line in dedicated hardware (as described in section 17.3) while 
the process of delaying and adding the detected signals from the channels can be done 
on-line using special purpose hardware or can be carried out off-line on the recorded, 
multi-channel data. In this scheme, the final time resolution obtained for a given pulsar 
observation is limited by the number of frequency channels that the band is split into. 
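
The delay-and-add step can be sketched in a few lines. The example below is an illustration only, not the GMRT implementation: it assumes detected data stored as one list of samples per channel, channels ordered from the highest frequency downwards, and delays rounded to whole samples.

```python
K_MS = 4.149e6  # constant of equation (17.4.1): delay in millisec, frequencies in MHz


def channel_delays(dm, f_top_mhz, bandwidth_mhz, n_ch, t_samp_ms):
    """Delay of each channel centre relative to the highest channel, in whole samples."""
    width = bandwidth_mhz / n_ch
    centres = [f_top_mhz - (i + 0.5) * width for i in range(n_ch)]
    f_ref = centres[0]
    return [round(K_MS * dm * (1.0 / f ** 2 - 1.0 / f_ref ** 2) / t_samp_ms)
            for f in centres]


def dedisperse(data, delays):
    """Shift each detected channel back by its delay and add the channels."""
    n_out = min(len(row) - d for row, d in zip(data, delays))
    return [sum(row[t + d] for row, d in zip(data, delays)) for t in range(n_out)]


# Hypothetical test signal: a one-sample pulse emitted at sample 50, observed in
# 16 channels across 320-330 MHz with 1 millisec sampling, for DM = 10 pc cm^-3.
N_CH, N_SAMP, T0 = 16, 200, 50
delays = channel_delays(10.0, 330.0, 10.0, N_CH, 1.0)
data = [[1.0 if t == T0 + d else 0.0 for t in range(N_SAMP)] for d in delays]

raw_sum = [sum(row[t] for row in data) for t in range(N_SAMP)]   # no correction
dedispersed = dedisperse(data, delays)

print("peak without correction:", max(raw_sum))
print("peak after dedispersion:", max(dedispersed), "at sample", dedispersed.index(max(dedispersed)))
```

Without the correction the pulse is spread over more than 20 samples; after it, all channels line up at the emission sample. The residual smearing within each channel is not removed by this method.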

In coherent dedispersion, one attempts to correct for interstellar dispersion in a 
pulsar signal of bandwidth $B$ before the signal goes through a detector, i.e. when it is still a 
voltage signal. It is based on the fact that the effect of interstellar dispersion on the 
electromagnetic signal from the pulsar can be modelled as a linear filtering operation. This 
means that, if the response of the filter is known, the original signal can be deconvolved 
from the received voltage signal by an inverse filtering operation. The time resolution 
achievable in this scheme is $1/B$ - the best possible for a signal of bandwidth $B$. 
Thus coherent dedispersion gives a better time resolution than incoherent dedispersion, 
for the same bandwidth of observation. It is the preferred scheme when very high time 
resolution studies are required - as in studies of profiles of millisecond pulsars and 
microstructure studies of slow pulsars. The main drawback of coherent dedispersion is 
that practical realisations of this scheme are not easy as it is a highly compute intensive 
operation. This is because the duration of the impulse response of the dedispersion filter 
(equal to the dispersion smear time across the bandwidth) can be quite long. To reduce 
the computational load, the deconvolution operation of the filtering is carried out in the 
Fourier domain, rather than in the time domain. Nevertheless, real time realisations of 
this scheme are limited in their bandwidth handling capability. Most coherent dedispersion 
schemes are implemented as off-line schemes where the final baseband signal from 
the telescope is recorded on high speed recorders and analysed using fast computers. 
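
The inverse filtering idea can be illustrated with a round trip on a synthetic voltage signal. The sketch below is not a production dedisperser. It assumes a complex baseband signal sampled at the bandwidth $B$, an upper-sideband convention for the sign of the phase, and a quadratic-like phase response whose group delay reproduces equation (17.4.1). A small pure-Python FFT is included only to keep the example self-contained; a real system would use an optimised FFT library.

```python
import cmath
import math

K_MS = 4.149e6  # constant of equation (17.4.1): millisec MHz^2 per (pc cm^-3)


def fft(x, inverse=False):
    """Unnormalised recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dispersion_filter(n, dm, f0_mhz, bandwidth_mhz):
    """Assumed frequency response of the dispersive medium at baseband."""
    response = []
    for k in range(n):
        # Frequency offset of FFT bin k from the band centre f0, in MHz.
        f = (k if k < n // 2 else k - n) * bandwidth_mhz / n
        # K_MS * 1e3 turns (millisec x MHz) into cycles; 2*pi gives radians.
        phase = 2.0 * math.pi * 1e3 * K_MS * dm * f * f / (f0_mhz ** 2 * (f0_mhz + f))
        response.append(cmath.exp(1j * phase))
    return response


def apply_filter(x, response):
    """Multiply the spectrum of x by a filter response and transform back."""
    spectrum = fft(x)
    y = fft([s * h for s, h in zip(spectrum, response)], inverse=True)
    return [v / len(x) for v in y]


# Hypothetical case: 1 MHz band at 325 MHz, DM = 2 pc cm^-3, 1 microsec samples.
N, DM, F0, B = 1024, 2.0, 325.0, 1.0
impulse = [0j] * N
impulse[100] = 1.0 + 0j

ism = dispersion_filter(N, DM, F0, B)
dispersed = apply_filter(impulse, ism)                                # what is received
recovered = apply_filter(dispersed, [h.conjugate() for h in ism])    # inverse filter

peak_dispersed = max(abs(v) ** 2 for v in dispersed)
peak_recovered = max(abs(v) ** 2 for v in recovered)
print("peak power: dispersed %.4f, recovered %.4f" % (peak_dispersed, peak_recovered))
```

The dispersed impulse is spread over several hundred samples, while the inverse filter restores it to a single sample, i.e. to the full $1/B$ resolution. The filter response must span at least the dispersion smear time, which is why the transform lengths, and hence the computing load, grow quickly with DM and bandwidth.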


17.5 Pulse Studies 

Pulsar pulse studies encompass a broad set of topics ranging from the study of the 
average properties of pulsar profiles to the study of microscopic phenomena in individual 
pulses. Though individual pulses from a pulsar show tremendous variations in properties 
such as shape, width, amplitude and polarisation, it is found that when a few thousand 
pulses (typically) are accumulated synchronously with the pulsar period, the resulting 
average profile shows a steady and constant form which can be considered to be a signature 
of that pulsar. Such an average profile typically exhibits one or more well defined regions 
of emission within the profile window. These are usually referred to as emission 
components and they can be partially or completely separated in pulse longitude. Similarly, the 
average polarisation properties also show a well defined signature in terms of the 
variations (across the profile window) of the amplitudes of linear and circular polarisation, 
as well as the angle of the linear polarisation vector. The average profile however does 
change with observing frequency for a given pulsar, with the typical signature being that 
profiles become wider at lower frequencies. Average pulse profile studies are important 
for characterising the overall properties of a pulsar. 

To obtain accurate average pulse profiles, one needs to observe the pulsar for a long 
enough stretch so that (i) the profile converges to a stable form and (ii) there is enough 
signal to noise. The time resolution should be enough to resolve the features of interest in 
the profile (typically 1% to 0.1% of the pulse period). Since the average profile is obtained 
by synchronous accumulation at the pulsar period (this is called ‘folding’ in pulsar 
jargon), the period and the sampling interval need to be known with sufficient accuracy to 
avoid any distortions due to smearing effects. It is easy to show that the fractional error 
in the period and the resultant fractional error in phase are related by 


$$ \frac{\Delta P}{P} = \frac{1}{N_p} \, \Delta\phi , \qquad (17.5.3) $$


where $N_p$ is the number of pulses used in the folding. As an example, if the distortions 
due to phase error are to be kept under one part in a thousand and $N_p = 1000$, then the 
period needs to be known to better than 1 part in a million. 
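
Folding, and the effect of an inaccurate period, can be demonstrated on a synthetic pulse train. The sketch below is a minimal illustration with hypothetical numbers (a period of 100.5 samples and a pulse occupying 10% of the period). Time is measured in units of the sampling interval.

```python
def fold(samples, period_in_samples, n_bins):
    """Average a time series synchronously with the given period."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, value in enumerate(samples):
        phase = (i / period_in_samples) % 1.0
        b = min(int(phase * n_bins), n_bins - 1)
        sums[b] += value
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def max_period_error(period, n_pulses, max_phase_error):
    """Equation (17.5.3): tolerable period error for a given phase error."""
    return period * max_phase_error / n_pulses


# Hypothetical pulsar: period of 100.5 samples, pulse occupying phases 0.2 to 0.3.
PERIOD, N_PULSES, N_BINS = 100.5, 1000, 10
n_samples = int(PERIOD * N_PULSES)
signal = [1.0 if 0.2 <= (i / PERIOD) % 1.0 < 0.3 else 0.0 for i in range(n_samples)]

good = fold(signal, PERIOD, N_BINS)                       # correct period
smeared = fold(signal, PERIOD * (1.0 + 1.0e-3), N_BINS)   # period wrong by 1 part in 1000

print("correct period :", [round(v, 2) for v in good])
print("1e-3 period err:", [round(v, 2) for v in smeared])
print("tolerable error for P = 1 s:", max_period_error(1.0, N_PULSES, 1.0e-3), "s")
```

With a period error of one part in a thousand, the phase drifts by a full cycle over 1000 pulses and the folded profile is washed out almost completely, in line with equation (17.5.3).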

Let us now look at the signal to noise ratio (SNR) for an average profile observation. 
For a pulsar of period $P$ and pulse width $W$ having a time average flux $S_{av}$, observed with a 
telescope of effective aperture $A_{eff}$ and system noise temperature $T_{sys}$, using a bandwidth 
$B$ and time constant $\tau_s$, the signal to noise ratio at a point on a profile obtained from $N_p$ 
pulses is given by 

$$ SNR_{avg} = \frac{S_{av} \, A_{eff}}{2 k \, T_{sys}} \, \sqrt{B \, \tau_s} \; \frac{P}{W} \, \sqrt{N_p} . \qquad (17.5.4) $$

Here the $P/W$ term is to convert the time average flux to on-pulse flux and the $\sqrt{N_p}$ term 
accounts for the SNR improvement due to addition of $N_p$ pulses. The other terms (with $k$ 
being Boltzmann's constant) are as for normal SNR calculations for continuum sources. 
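
As a rough numerical illustration, the sketch below evaluates the SNR of an average profile under the usual single-polarization radiometer relation, in which flux density is converted to antenna temperature through $A_{eff}/2k$. That conversion, and all the input values, are assumptions made for the example rather than parameters of any particular telescope.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K
JANSKY = 1.0e-26           # W m^-2 Hz^-1


def snr_average_profile(s_av_jy, a_eff_m2, t_sys_k, bandwidth_hz, tau_s,
                        period_s, width_s, n_pulses):
    """SNR at a point on a profile folded from n_pulses pulses (one polarization)."""
    gain_k_per_jy = a_eff_m2 * JANSKY / (2.0 * BOLTZMANN)
    on_pulse_temperature = s_av_jy * gain_k_per_jy * period_s / width_s
    return on_pulse_temperature / t_sys_k * math.sqrt(bandwidth_hz * tau_s * n_pulses)


# Hypothetical observation: a 10 mJy pulsar with P = 1 s and W = 50 millisec,
# observed with A_eff = 1000 m^2, T_sys = 100 K, B = 16 MHz and tau_s = 1 millisec.
params = dict(s_av_jy=0.01, a_eff_m2=1000.0, t_sys_k=100.0,
              bandwidth_hz=16.0e6, tau_s=1.0e-3, period_s=1.0, width_s=0.05)

snr_folded = snr_average_profile(n_pulses=1000, **params)
snr_single = snr_average_profile(n_pulses=1, **params)
print("SNR after 1000 pulses: %.2f, single pulse: %.3f" % (snr_folded, snr_single))
```

The factor of about 30 between the two values is the $\sqrt{N_p}$ gain from folding, which is what is lost when individual pulses are studied.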

When single pulses from a pulsar are examined in detail, it is seen that the radiation 
in each pulse does not always occur all over the average profile window. Usually, 
the signal is found located sporadically at different longitudes in the profile window. 
These intensity variations are called sub-pulses and they have a typical width that is less 
than the width of the average profile. For some pulsars, sub-pulses in successive pulses 
don’t always occur randomly in the profile window; they are found to move systematically 
in longitude from one pulse to the next. These are called drifting sub-pulses and are 
thought to be one of the intriguing features of the emission mechanism. For some pulsars, 
there are times when there is practically no radiation seen in the entire profile window 
for one or more successive pulses. This phenomenon is called nulling and is another of 
the unexplained mysteries of pulsar radiation. Polarisation properties of sub-pulses also 
show significant deviations from the overall polarisation properties of the average profile. 
Studies of sub-pulses require time resolutions that are 0.1% of the pulse period, or better. 

When single pulses are observed with still further time resolution, it is found that 
narrow bursts of emission are also seen with time scales much shorter than sub-pulse 
widths. This is called microstructure and the time scales go down to microseconds and 
less. Seeing pulsar microstructure almost always requires the use of coherent 
dedispersion techniques to achieve the desired time resolution. It is clear from the above that 
pulsar intensities show fluctuations at various time scales within a pulse period. A useful 
analysis technique that separates out the various time scales is the intensity correlation 
function. 

It is worth pointing out that single pulse observations are the worst affected among 
all kinds of pulsar studies, from the point of view of signal to noise ratio. This is simply 
because the $\sqrt{N_p}$ advantage in equation (17.5.4) is not available. Also, as $\tau_s$ is reduced for 
higher time resolution studies, the SNR decreases further. Hence such studies need the 
largest collecting area telescopes and can often be done on only the strongest pulsars. 


17.6 Interstellar Scintillation Studies 

The propagation of pulsar signals through the interstellar medium of the Galaxy modifies 
the properties of the received radiation in several ways. A study of these effects can give 
useful information about the interstellar medium. One of these effects that has already 
been looked at is interstellar dispersion. It gives us information about the mean electron 
density of the interstellar plasma. 

Another effect that is significant in pulsar observations is interstellar scintillation. It 
is caused by scattering of the radiation due to random fluctuations of electron density in 
the interstellar plasma. It produces the following effects (not all of which are easily 
observable!): (i) angular broadening of the source, as scattered radiation now arrives from a 
range of angles around the direction to the pulsar; (ii) temporal pulse broadening due to 
the delayed arrival of scattered radiation; (iii) random fluctuations of pulsar intensity as a 
function of time and frequency due to interference effects between radiation arriving from 
different directions. All these effects increase in strength with decreasing frequency and 
with increasing length of plasma between source and observer. A detailed study of 
interstellar scintillation effects in pulsar signals can be used to obtain valuable constraints on 
the extent and location of scattering plasma in the interstellar medium, as well as on the 
spatial power spectrum of electron density fluctuations in the medium. 

Of the three effects of scintillation described above, the random fluctuations of 
intensity are the most easily observable and form the best probes of the phenomenon. They 
are readily seen in pulsar dynamic spectra, which are records of the on-pulse intensity as 
a function of time and frequency. A single time sample in the dynamic spectrum is obtained 
by accumulating the total energy under the pulse window for a given number of pulses, 
for each of $N_{ch}$ channels. These random intensity fluctuations have typical decorrelation 
scales in time and frequency, which are estimated by performing a two dimensional 
autocorrelation of the dynamic spectra data. These decorrelation widths are of the order 
of a few minutes and hundreds of kHz, respectively, at metre wavelengths. This means 
that typical observations have to be carried out with time and frequency resolutions of 
the order of tens of seconds and tens of kHz in order to observe the scintillations. This 
requirement becomes more stringent at lower frequencies and for more distant pulsars 
(which are more strongly scattered). Also, the observations need to span a sufficient number 
of these random scintillations in order to obtain statistically reliable values for the two 
decorrelation widths. This usually requires observing durations of an hour or so with 
bandwidths of a few MHz. 
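
The autocorrelation analysis can be sketched as follows. This is a minimal illustration on a synthetic, deterministic dynamic spectrum. The half-maximum criterion used for both axes is an assumption of the example; other conventions for defining the decorrelation widths are also in use.

```python
import math


def acf2d(dyn, lag_t, lag_f):
    """Normalised autocovariance of dyn[t][f] at a non-negative (time, frequency) lag."""
    n_t, n_f = len(dyn), len(dyn[0])
    mean = sum(map(sum, dyn)) / (n_t * n_f)
    var = sum((v - mean) ** 2 for row in dyn for v in row) / (n_t * n_f)
    total, count = 0.0, 0
    for t in range(n_t - lag_t):
        for f in range(n_f - lag_f):
            total += (dyn[t][f] - mean) * (dyn[t + lag_t][f + lag_f] - mean)
            count += 1
    return total / (count * var)


def decorrelation_lag(dyn, axis, level=0.5):
    """Smallest lag along one axis at which the ACF drops below `level`."""
    n = len(dyn) if axis == "time" else len(dyn[0])
    for lag in range(1, n):
        value = acf2d(dyn, lag, 0) if axis == "time" else acf2d(dyn, 0, lag)
        if value < level:
            return lag
    return None


# Hypothetical dynamic spectrum: 64 time samples by 32 channels, with intensity
# maxima repeating every 16 samples in time and every 8 channels in frequency.
dyn = [[1.0 + math.cos(2.0 * math.pi * t / 16.0) * math.cos(2.0 * math.pi * f / 8.0)
        for f in range(32)] for t in range(64)]

lag_time = decorrelation_lag(dyn, "time")
lag_freq = decorrelation_lag(dyn, "freq")
print("decorrelation lags: %d time samples, %d channels" % (lag_time, lag_freq))
```

Multiplying the lags by the time and frequency resolution of the dynamic spectrum gives the decorrelation widths in physical units. Real data contain noise and a finite number of scintles, so the estimates carry a statistical uncertainty that this sketch ignores.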

Due to the effect of large scale electron density fluctuations in the interstellar medium, 
the values of the decorrelation widths and the mean pulsar flux fluctuate with time. A 
study of this phenomenon (called refractive scintillations) requires regular monitoring 
of pulsar dynamic spectra at different epochs, typically a few days apart and spanning 
several weeks to months. Such data can also be used to estimate the mean transverse 
speeds of pulsars. 


17.7 Pulsar Timing Studies 

Pulsar timing studies involve accurate measurements of the time of arrival of the pulses, 
followed by appropriate modelling of the observed arrival times to study and understand 
various phenomena that can affect the arrival times. 

The first step of accurate estimation of arrival times is achieved as follows. First, 
at each epoch of observation, data from the pulsar is acquired with sufficient 
resolution in time and frequency and over a long enough stretch so that a reliable estimate of 
the average profile can be obtained. The effective time resolution should be about 
one-thousandth of the period. Second, the absolute time for at least one well defined point 
in the observation interval is measured with the best possible accuracy. Traditionally, 
atomic clocks have been used for this purpose. With the advent of the Global Positioning 
System (GPS), absolute time (UTC) tagging with an accuracy of ~ 100 nanosec is possible 
using commercially available GPS receivers. Third, the fractional phase offset with 
respect to a reference epoch is calculated for the data at each epoch. This is generally best 
achieved by cross-correlating the average profile at the epoch with a template profile and 
estimating the shift of the peak of the cross-correlation function. This shift, in units of 
time, is added to the arrival time measurement to reference the arrival times to the same 
phase of the pulse. Fourth, the arrival times measured at the observatory on the Earth 
are referred to a standard inertial point, which is taken as the barycenter of the solar 
system. These corrections include effects due to the rotation and revolution of the Earth, 
the effect of the Earth-Moon system on the position of the Earth and the effect of all the 
planets in the Solar System. Relativistic corrections for the clock on the Earth are also 
included, as are corrections for dispersion delay at the Doppler corrected frequency of 
observation. Last, a pulse number, relative to the pulse at the reference epoch, is attached 
to the arrival time for each epoch. This can be a tricky affair, since to start with the pulsar 
period may not be known accurately, and it is possible to err by an integer number of pulses 
when computing the pulse number. To avoid this danger, a boot-strapping technique is 
used where the initial epochs of observations are close enough so that, given the accuracy 
of the period, the phase error cannot exceed one cycle between two successive epochs. 
As the period gets determined with better accuracy by modelling the initial epochs, the 
spacing between successive epochs can be increased. The net result of the above exercise 
is a series of data pairs containing time of arrival and pulse number, both relative to the 
same starting point. 
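
The third step, estimating the phase offset by cross-correlating with a template, can be sketched as below. This is an illustration only: it works in the time domain with a circular cross-correlation and refines the peak with a simple parabolic fit, whereas more refined estimators exist. The template shape and the 25 millisec period are hypothetical.

```python
import math


def circular_cross_correlation(profile, template):
    """CCF at every integer lag, treating both profiles as periodic."""
    n = len(profile)
    return [sum(profile[(i + lag) % n] * template[i] for i in range(n))
            for lag in range(n)]


def phase_shift(profile, template):
    """Shift of `profile` relative to `template`, as a fraction of the period."""
    ccf = circular_cross_correlation(profile, template)
    n = len(ccf)
    k = max(range(n), key=ccf.__getitem__)
    # Parabolic interpolation through the peak and its neighbours (sub-bin estimate).
    y0, y1, y2 = ccf[(k - 1) % n], ccf[k], ccf[(k + 1) % n]
    denom = y0 - 2.0 * y1 + y2
    offset = 0.5 * (y0 - y2) / denom if denom != 0.0 else 0.0
    shift = (k + offset) / n
    return shift - 1.0 if shift > 0.5 else shift


# Hypothetical Gaussian template, and an observed profile arriving 10 bins later.
N_BINS = 256
template = [math.exp(-0.5 * ((i - 64) / 5.0) ** 2) for i in range(N_BINS)]
observed = [template[(i - 10) % N_BINS] for i in range(N_BINS)]

shift = phase_shift(observed, template)
period_ms = 25.0                      # hypothetical pulsar period
print("phase shift %.5f cycles = %.4f millisec" % (shift, shift * period_ms))
```

The shift, converted to time using the period, is the quantity that references the measured arrival time to a fixed phase of the pulse.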

The second step in the analysis is the modelling of the above data points. This is 
usually done by expressing the pulse phase at any given time in terms of the pulsar 
rotation frequency and its derivatives as follows 


$$ \phi_i = \phi_0 + \nu_0 t_i + \dot{\nu}_0 t_i^2 / 2 + \cdots , \qquad (17.7.5) $$

where $\nu_0 = 1/P$. Least squares fits for $\phi_0$, $\nu_0$, $\dot{\nu}_0$, etc., can be obtained from such a model. 
In addition, by examining the residuals between the model and the data, other parameters 
that affect the pulsar timing can be estimated. These include errors in the positional 
estimate of the pulsar, its proper motion, perturbations to the pulsar’s motion due to 
the presence of companions, sudden changes in the pulsar’s rotation rate etc. In fact, 
good quality timing observations can be used to extract a wealth of information, including 
the stability of pulsars vis-a-vis the best terrestrial clocks! 
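
A least squares fit of the spin model to (arrival time, pulse phase) pairs can be sketched as follows. The example is an illustration on noiseless synthetic data with hypothetical spin parameters. Time is rescaled before forming the normal equations, since fitting directly in seconds over a span of months is numerically ill-conditioned.

```python
def solve_3x3(a, b):
    """Gaussian elimination with partial pivoting for a 3x3 linear system."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def fit_spin_model(times, phases):
    """Least squares fit of phi = phi0 + nu0*t + nudot*t^2/2; returns (phi0, nu0, nudot)."""
    span = max(abs(t) for t in times)
    xs = [t / span for t in times]                  # scaled time in [0, 1]
    basis = [[1.0, x, x * x] for x in xs]
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    atb = [sum(row[i] * p for row, p in zip(basis, phases)) for i in range(3)]
    c0, c1, c2 = solve_3x3(ata, atb)
    return c0, c1 / span, 2.0 * c2 / span ** 2      # undo the time scaling


# Hypothetical pulsar: nu0 = 1.5 Hz, nudot = -1e-13 Hz/s, observed at 51 epochs
# spread over 1e7 s (about four months).
PHI0, NU0, NUDOT = 0.25, 1.5, -1.0e-13
times = [i * 2.0e5 for i in range(51)]
phases = [PHI0 + NU0 * t + 0.5 * NUDOT * t * t for t in times]

phi0_fit, nu0_fit, nudot_fit = fit_spin_model(times, phases)
print("phi0 = %.4f, nu0 = %.12f Hz, nudot = %.4e Hz/s" % (phi0_fit, nu0_fit, nudot_fit))
```

In real data the phases are integer pulse numbers plus small measured offsets, and the residuals left after such a fit are what reveal position errors, proper motion, companions and glitches.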


17.8 Pulsar Search 

At the end, we come to the observation and analysis techniques used for discovering new 
pulsars. Pulsar searches fall into one of two broad categories: targeted and untargeted 
searches. In an untargeted search (or survey) for pulsars, the idea is to uniformly cover 
a large area of the sky with a desired sensitivity in flux level. In targeted searches, one 
is searching a limited area of the sky where there is a higher than normal possibility of 
finding a pulsar (for example, the region in and around a supernova remnant or a steep 
spectrum point source identified in mapping studies). Here some of the parameters of the 
search can be tailored to suit the a priori knowledge about the search region. 

For a pulsar survey, (i) the range of directions to search in, (ii) the 
frequency of observations, (iii) the bandwidth and number of spectral channels, (iv) the 
sampling interval and (v) the duration of the observations are some of the critical parameters 
that need to be chosen carefully. The choice of these parameters is interlinked in many 
cases. 

Analysis of pulsar search data is an extremely compute intensive task. For each 
position in the sky for which data is recorded, the analysis technique needs to search 
for the presence of a periodic signal in the presence of system noise. However, from the 
discussion in section 17.4, it is clear that if appropriate dispersion correction is not done, 
the sensitivity to the presence of a periodic signal can be reduced significantly. Since a 
pulsar can be located at any distance (and hence DM) along a given direction in the sky, 
the search has to be carried out in (at least) two dimensions: DM and period. For this, 
the data is dedispersed for different trial dispersion measures. For each choice of DM, 
the dedispersed data is searched for a periodic signal. 

To reduce the computational load for search data analysis, several optimised 
algorithms are used. For example, when dedispersing for a range of DM values, it is possible 
to use the results from the computations for some DM values to compute part of the 
results for some other DM values. This saves a lot of redundant calculations. This method, 
known as Taylor’s Dedispersion Algorithm, is used quite often. Similarly, there are 
optimised techniques for searching for periodic signals in the presence of noise. The simplest 
method is to fold the dedispersed data for each choice of possible period and examine 
the resulting profile for the presence of a significant peak that is well above the noise 
level. Once again, computations done for folding at a given period can be used for folding 
at other periods. This redundancy is exploited by the Fast Folding Algorithm. A signal 
containing a periodic train of pulses gives a well defined signature in the Fourier domain 
- its spectrum consists of peaks at the frequency corresponding to the periodicity, and 
harmonics thereof. It can be shown that it is possible to detect the periodic signal by 
searching for harmonically related peaks in the spectral domain. It turns out that it is 
more economical to implement the FFT followed by harmonic search technique compared 
to the folding search techniques. 
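
The FFT followed by harmonic search can be sketched on a synthetic dedispersed time series. The example below is a simplified illustration: it sums the power in the first four harmonics of every trial fundamental and reports the best candidate, without the significance thresholds, harmonic-fold trials or red-noise handling that a real search needs. The pure-Python FFT is included only to keep the example self-contained.

```python
import cmath
import math
import random


def fft(x):
    """Unnormalised recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def harmonic_sum_search(series, n_harmonics=4):
    """Return the fundamental bin with the largest summed harmonic power."""
    n = len(series)
    mean = sum(series) / n
    power = [abs(c) ** 2 for c in fft([v - mean for v in series])[: n // 2]]
    best_bin, best_sum = 0, 0.0
    for k in range(1, n // (2 * n_harmonics)):
        total = sum(power[h * k] for h in range(1, n_harmonics + 1))
        if total > best_sum:
            best_bin, best_sum = k, total
    return best_bin, best_sum


# Hypothetical dedispersed series: Gaussian pulses every 64 samples plus noise.
random.seed(0)
N, PERIOD = 4096, 64
series = [math.exp(-0.5 * ((i % PERIOD - 32) / 3.0) ** 2) + random.gauss(0.0, 0.2)
          for i in range(N)]

best_bin, best_sum = harmonic_sum_search(series)
detected_period = N / best_bin
print("best fundamental bin %d, period %.1f samples" % (best_bin, detected_period))
```

In a full search this step is repeated for every trial DM, and the candidates from all trials are then compared.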

Additional complications are introduced in the search algorithm when one allows the 
parameter space to cover pulsars in binary orbits as the period can actually change during 
the interval of observation. Special processing techniques are needed to handle such 
requirements. 


17.9 Further Reading 


18.1 Introduction 

The Giant Metrewave Radio Telescope (GMRT) consists of an array of 30 antennas. Each 
antenna is 45 m in diameter, and has been designed to operate at a range of frequencies 
from 50 MHz to 1450 MHz. The antennas have been constructed using a novel technique 
(nicknamed SMART) and their reflecting surface consists of panels of wire mesh. These 
panels are attached to rope trusses, and by appropriate tensioning of the wires used 
for attachment the desired parabolic shape is achieved. This design has very low wind 
loading, as well as a very low total weight for each antenna. Consequently it was possible 
to build the entire array very economically. In this chapter I give a very brief overview of 
the GMRT. Subsequent chapters discuss in detail each of the major subsystems of the 
GMRT. 


18.2 Array Configuration 

The GMRT has a hybrid configuration (see Figure 18.1), with 14 of its antennas randomly 
distributed in a central region (~ 1 km across), called the central square. The 
distribution of antennas in the central square was deliberately “randomized” to avoid grating 
lobes. The antennas in the central square are labeled as Cnn, with nn going from 00 
to 14 (i.e. C00, C01, ..., C14)¹. The remaining antennas are distributed in a roughly Y 
shaped configuration, with the length of each arm of the Y being ~ 14 km. The maximum 
baseline length between the extreme arm antennas is ~ 25 km. The arms are called the 
“East”, “West” and “South” arms and the antennas in these arms are labeled E01..E06, 
W01..W06 and S01..S06 for the east, west and south arm respectively. 

The central square antennas provide a large number of relatively short baselines. This 
is very useful for imaging large extended sources, whose visibilities are concentrated near 
the origin of the UV plane. The arm antennas on the other hand are useful in imaging 
small sources, where high angular resolution is essential. A single GMRT observation 
hence yields information on a variety of angular scales. 

¹ The array was originally meant to have 34 antennas, but because of escalating costs, was finally constructed 
with 30. Consequently some antenna stations do not actually have any antennas in them, resulting in “missing” 
numbers (C07, E01, S05) in the numbering sequence. 

[Figure: locations of GMRT antennas (30 dishes)] 

Figure 18.1: GMRT array configuration. 


18.3 Receiver System 

The GMRT currently operates at 5 different frequencies ranging from 150 MHz to 1420 MHz. 
Some antennas have been equipped with receivers which work up to 1750 MHz. Above 
this frequency range however, the antenna performance degrades rapidly, both because 
the reflectivity of the mesh falls and because the aperture phase errors, caused by the 
deviations of the plane mesh facets from a true parabola, increase rapidly. A 50 MHz 
receiver system is also planned. Table 18.1 gives the relevant system parameters at the 
nominal center frequency of the different operating frequencies of the GMRT. 

The GMRT feeds (except for the 1420 MHz feed) are circularly polarized. The circular 
polarization is achieved by means of a polarization hybrid inserted between the feeds and 
the RF amplifiers. No polarization hybrid was inserted for the 1420 MHz feed, in order to 
keep the system temperature low. None of the receivers are cooled, i.e. they all operate 
at the ambient temperature. The feeds are mounted on four faces of a feed turret placed 
at the focus of the antenna. The feed turret can be rotated to make any given feed point 
to the vertex of the antenna. The feed on one face of the turret is a dual frequency feed, 
i.e. it works at both 233 MHz as well as 610 MHz. 

After the first RF amplifier, the signals from all the feeds are fed to a common 
second stage amplifier (this amplifier has an input select switch allowing the user to choose 
which RF amplifier’s signal is to be selected), and then converted to IF. Each polarization 
is converted to a different IF frequency, and then fed to a laser-diode. The optical 
signals generated by the laser-diode are transmitted to a central electronics building (CEB) 
by fiber optic cables. At the central electronics building, they are converted back into 
electrical signals by a photo-diode, converted to baseband frequency by another set of 
mixers, and then fed into a suitable digital backend. Control and telemetry signals are 
also transported to and from the antenna over the fiber-optic communication system. 
Each antenna has two separate fibers for the uplink and downlink. 


System Properties                                   Frequency (MHz)
                                            50     153    233    327    610    1420

Primary beam (degree)                              3.8    ...    1.8    ...    0.4x(1400/f)
Synthesized beam
  Full array (arcsec)                              20     13     09     05     02
  Central array (arcmin)                           7.0    ...    ...    1.7    0.7
System temperature (K)
  (1) T_receiver (including cable losses)          ...    55     50     60     40
  (2) T_ground = T_mesh + T_spillover              30     23     ...    22     32
  (3) T_sky                                        ...    99     40     10     4
  Total T_sys                                      ...    ...    ...    92     76
    = T_sky + T_receiver + T_ground
Gain of an antenna (K/Jy)                          0.33   0.33   0.32   0.32   0.22
RMS noise in image* (µJy)                          46     17     10     09     13

*For assumed bandwidth of 16 MHz, integration of 10 hours and natural weighting 
(theoretical). 


Table 18.1: System parameters of the GMRT 


18.4 Digital Backends 

There are a variety of digital backends available at the GMRT. The principal backend 
used for interferometric observations is a 32 MHz wide FX correlator. The FX correlator 
produces a maximum of 256 spectral channels for each of two polarizations for each 
baseline. The integration time can be as short as 128 ms, although in practice 2 sec 
is generally the shortest integration time that is used. The FX correlator itself consists 
of two 16 MHz wide blocks, which are run in parallel to provide a total instantaneous 
observing bandwidth of 32 MHz. For spectral line observations, where fine resolution may 
be necessary, the total bandwidth can be selected to be less than 32 MHz. The available 
bandwidths range from 32 MHz to 64 kHz in factors of 2. The maximum number of spectral 
channels however remains fixed at 256, regardless of the total observing bandwidth. The 
GMRT correlator can measure all four Stokes parameters; however, this mode has not 
yet been enabled. In the full polar mode, the maximum number of spectral channels 
available is 128. Dual frequency observations are also possible at 233 and 610 MHz; 
however, in this case only one polarization can be measured at each frequency. The 
array can be split into sub-arrays, each of which can have its own frequency settings 
and target source. The correlator is controlled using a distributed control system, and 
the data acquisition is also distributed. The correlator output, i.e. the raw visibilities, is 
recorded in a GMRT specific format, called the “LTA” format. Programmes are available 
for the inspection, display and calibration of LTA files, as well as for the conversion of 
LTA files to FITS. 

The first block of the GMRT pulsar receiver is the GMRT Array Combiner (GAC) which 
can combine the signals from the user-selected antennas (up to a maximum of 30) for 
both incoherent and coherent array operations. The input signals to the GAC are the 
outputs of the Fourier Transform stage of the GMRT correlator, consisting of 256 spectral 
channels across the bandwidth being used, for each of the two polarizations from each 
antenna. The GAC gives independent outputs for the incoherent and coherent array 
summed signals, for each of two polarizations. For the nominal, full bandwidth mode of 
operation, the sampling interval at the output of the GAC is 16 microsec. 

Different back-end systems are attached to the GAC for processing the incoherent and 
coherent array outputs. The incoherent array DSP processor takes the corresponding 
GAC output signals and can integrate the data to a desired sampling rate (in powers of 2 
times 16 microsec). It gives the option of acquiring either one of the polarizations or the 
sum of both. It can also collapse adjacent frequency channels, giving a slower net data 
rate at the cost of reduced spectral resolution. The data is recorded on the disk of the 
main computer system. 

The coherent array DSP processor takes the dual polarization, coherent (voltage sum) 
output of the GAC and can produce an output which gives 4 terms - the intensities for 
each polarization and the real and imaginary parts of the cross product - from which the 
complete Stokes parameters can be reconstructed. This hardware can be programmed to 
give a sub-set of the total intensity terms for each polarization or the sum of these two. 
The minimum sampling interval for this data is 32 microsec, as two adjacent time samples 
are added in the hardware. Further preintegration (in powers of 2) can be programmed 
for this receiver. The final data is recorded on the disk of the main computer system. 

There is another independent full polarimetric back-end system that is attached to the 
GAC. This receiver produces the final Stokes parameters, I, Q, U & V. However, due to a 
limitation of the final output data rate from this system, it cannot dump full spectral 
resolution data at fast sampling rates. Hence, for pulsar mode observations the user 
needs to opt for online dedispersion or gating or folding before recording the data (there 
is also an online spectral averaging facility for non-pulsar mode observations). 

In addition, there is a search preprocessor back-end attached to the incoherent array 
output of the GAC. This unit gives 1-bit data, after subtracting the running mean, for 
each of the 256 spectral channels. Either one of the polarizations or the sum of both can 
be obtained. 

Most sub-systems of the pulsar receiver can be configured and controlled with an 
easy to use graphical user interface that runs on the main computer system. For pulsar 
observations, since it is advisable to switch off the automatic level controllers at the IF 
and baseband systems, the power levels from each antenna are individually adjusted to 
ensure proper operating levels at the input to the correlator. The format for the binary 
output data is peculiar to the GMRT pulsar receiver. Simple programs to read the data 
files and display the raw data - including facilities for dedispersion and folding - are 
available at the observatory and can be used for first order data quality checks, both for 
the incoherent mode and coherent mode systems. 


19.1 Introduction 

A radio telescope in its simplest form consists of three components (see also Chapter 3), 
(i) an antenna that selectively receives radiation from a small region of the sky, (ii) a 
receiver that amplifies a restricted frequency band from the output of the antenna and 
(iii) a recorder for registering the receiver output. In this chapter we focus on the antenna, 
and in particular the antennas used for the GMRT. 

The GMRT antennas are parabolic reflector antennas. The first reflector antenna was 
invented by Heinrich Hertz in 1888 to demonstrate the existence of electromagnetic waves, 
which had been theoretically predicted by J. C. Maxwell. Hertz’s antenna was a cylindrical 
parabola of f/D = 0.1 and operated at a wavelength of 66 cm (450 MHz). The next known 
reflector antenna was that constructed in 1930 by Marconi for investigating microwave 
propagation. After that, in 1937, Grote Reber constructed the prototype of the modern 
dish antenna: a prime-focus parabolic reflector antenna of 9.1 m diameter, which he 
used to make the first radio maps of the sky. During and after World War II, radar and 
satellite communication requirements caused great advances in antenna technology. 


19.2 Types of Antennas 

A diverse variety of antennas have been used for radio astronomy (see e.g. Chapter 3), the 
principal reason for this diversity being the wide range of observing wavelengths: from 
~ 100 m to ~ 1 mm, a range of 10^5. However the most common antenna used for radio 
astronomy is the paraboloid reflector with either a prime-focus feed or a Cassegrain type 
feed arrangement. 

Prime-focus parabolic antennas, although mechanically simple, have certain disadvantages, 
viz. (i) the image-forming quality is poor due to lower f/D ratios in prime-focus 
antennas, and (ii) the feed antenna pattern extends beyond the edge of the parabolic 
reflector and the feed hence picks up some thermal radiation from the ground. The Cassegrain 
system, which uses a secondary hyperboloid reflector and has the feed located at the 
second focus of the secondary, solves these problems. For Cassegrain systems the f/D ratio 
is higher and further the feed “looks” upwards and hence pick up from the ground is 
minimized. This is a great advantage at higher frequencies, where the ground brightness 
temperature (~ 300 K) is much higher than the brightness temperature of the sky. However 
this is achieved at the price of increased aperture blockage caused by the secondary 
reflector. 

A primary advantage of paraboloid antennas (prime focus or Cassegrain) is the ease 
with which receivers can be coupled to them. The input terminals are at the feed horn 
or dipole. A few other advantages are: (i) high gain: a gain of ~ 25 dB is easily 
achievable for aperture diameters as small as 10 λ; (ii) full steerability, generally either by 
polar or azimuth-elevation mounting; further, the antenna characteristics are to first 
order independent of pointing; (iii) operation over a wide range of wavelengths simply by 
changing the feed at the focus. 

Compared to optical reflectors paraboloid reflectors used for radio astronomy generally 
have a short f/D ratio. Highly curved reflectors required for higher f/D ratios result in 
increased costs and reduced collecting areas. Although the reflecting antennas are to 
first order frequency independent, there is nonetheless a finite range of frequencies over 
which a given reflector can operate. The shortest operating wavelength is determined by 
the surface smoothness of the parabolic reflector. If λ_min is the shortest wavelength, 

    \lambda_{min} \approx 20\,\sigma    (19.2.1) 

where σ is the rms deviation of the reflector surface from a perfect paraboloid. Below 
λ_min the antenna performance degrades rapidly with decreasing wavelength. The longest 
operating wavelength, λ_max, is governed by diffraction effects. As a rule of thumb the 
largest operating wavelength λ_max is given by 

    \lambda_{max} < 2L    (19.2.2) 

where L is the mean spacing between feed-support legs. At λ = L the feed support 
structure would completely shadow the reflector. 


19.3 Characterizing Reflector Antennas 

One important property of any antenna is that its radiation characteristics when it is used 
as a transmitter are the same as when it is in the receiving mode. This is a consequence of 
the well-known principle of reciprocity for electromagnetic fields. Even though radio telescope 
antennas are generally used only for receiving signals, it is often simpler to characterize 
them by considering the antenna to be in the transmitting mode. Antenna terminology is 
also influenced by the reciprocity principle; for example, we have been calling the dipole 
or horn placed at the focus of the reflector to receive the signal from distant sources 
the “feed”, i.e. as though it were coupled to a transmitter rather than a receiver. 

All antennas can be described by the following characteristics (see also Chapter 3) 

1. Radiation pattern The field strength that the antenna radiates as a function of 
direction. The simplest type of antenna normally radiates most of its energy in one 
direction called the ‘primary beam’ or ‘main lobe’. The angular width of the main 
lobe is determined by the size and design of the antenna. It is usually parametrized 
by its full width at half maximum, also called its 3 dB beamwidth. Weaker secondary 
maxima in other directions are called side lobes. Although the pattern is a function 
of both elevation and azimuth angle, it is often only specified as a function of 
elevation angle in two special orthogonal planes, called the E-plane and the H-plane. 

2. Directivity The radiated power in the direction of the main lobe relative to what would 
be radiated by an isotropic antenna with the same input power. A related quantity 
called the Gain also takes into account any electrical losses of the antenna. For 
reflector antennas, one can also define an aperture efficiency which is the ratio of 
the effective collecting area of the telescope to its geometric area. For the relation 
between the gain and the effective collecting area see Chapter 3. 

3. Polarization The sense of polarization that the antenna radiates or receives as a 
function of direction. This may be linear, circular, or elliptical. Note that when 
describing the polarization of a wave, it is sufficient to specify the polarization of the 
electric-field vector. 

4. Impedance From the point of view of the microwave circuit behind the antenna, the 
antenna can be represented as a complex load impedance. The characteristics of 
this load depend on the radiation patterns of the antenna and hence the design of 
the antenna. The goal of a good design is to match the impedance of the antenna to 
the impedance of the transmission line connecting the antenna to the receiver. The 
impedance match can be characterized by any one of the following parameters: 

• the voltage reflection coefficient, ρ_v. 

• the return loss (in dB), R_L = -20 log|ρ_v|. 

• the voltage standing-wave ratio, VSWR = (1 + |ρ_v|)/(1 - |ρ_v|). 

5. Phase Center All horns and feeds have a phase center. This is the theoretical point 
along the axis of the feed which is the center of curvature of the phase fronts of the 
emerging spherical waves. 
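
The three impedance-match parameters listed under item 4 are equivalent descriptions of the same mismatch. A minimal Python sketch of the conversions is given here; the function names are illustrative only and are not taken from any GMRT software. As a worked value it uses the SWR < 2.0 criterion that this chapter applies to the feeds.

```python
import math


def return_loss_db(rho):
    """Return loss R_L = -20 log10|rho| in dB for a voltage reflection coefficient rho."""
    return -20.0 * math.log10(abs(rho))


def vswr(rho):
    """Voltage standing-wave ratio (1 + |rho|) / (1 - |rho|)."""
    mag = abs(rho)
    return (1.0 + mag) / (1.0 - mag)


def rho_from_vswr(s):
    """Magnitude of the reflection coefficient that produces a given VSWR."""
    return (s - 1.0) / (s + 1.0)


# A VSWR of 2.0 corresponds to |rho| = 1/3, i.e. a return loss of about 9.5 dB.
rho_limit = rho_from_vswr(2.0)
print(round(rho_limit, 4), round(return_loss_db(rho_limit), 2))
```

A feed that stays below VSWR = 2.0 therefore reflects at most one ninth of the incident power.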


19.4 Computing Reflector Antenna Radiation Patterns 

Reflector antenna radiation patterns are determined by a number of factors, but the 
most important ones are the radiation pattern of the feed antenna and the shape of the 
reflector. Parabolic reflectors have the unique feature that all path lengths from the focal 
point to the reflector and on to the aperture plane are the same. As shown in Figure 19.1, 

    FP + PA = \rho + \rho\cos\theta' 
            = \rho\,(1 + \cos\theta')    (19.4.3) 
            = 2f, 

since the parabola is described in polar form by ρ(1 + cos θ') = 2f. 

When the reflector dimensions are large compared to the wavelength, geometrical optics 
principles can be used to determine the power distribution in the aperture plane. If 
the feed pattern is azimuthally symmetric, then the normalized far-field radiation pattern 
of the reflector depends on 

1. The variable u, defined by πu = k a sin θ, where a is the radius of the aperture, 
k = 2π/λ, and θ is the angle subtended by the far-field point with respect to the 
parabola’s focal axis. 

2. The feed taper, C [4],[5], which is defined as the amplitude of the feed radiation 
pattern at the rim of the parabolic reflector relative to the maximum value (assumed to 
be along the parabola axis). (Note that in standard power plots of radiation patterns 
(in dB), the edge taper T_E is related to C by T_E = 20 log C.) 
Figure 19.1: Geometry for determining the aperture field distribution for a prime focus 
parabolic antenna. 
3. The focal length f, which determines how the power from the feed is spread over 
the aperture plane. If g(θ') is the radiation pattern of the feed, r is distance in the 
aperture plane, and g(r) is the power density in the aperture plane, then we have 

    g(r)\,dr = g(\theta')\,d\theta', \quad \text{i.e.} \quad g(r) = g(\theta')\,\frac{d\theta'}{dr}    (19.4.4) 

and from Figure 19.1, where r = ρ sin θ' = 2f tan(θ'/2), we have 

    \frac{d\theta'}{dr} = \frac{1 + \cos\theta'}{2f}    (19.4.5) 


In Chapter 3 we saw that the far field is in general the Fourier transform of the aperture 
plane distribution. In the case of azimuthally symmetric distributions, this can be written 
as 

    F(u) = \int_0^{\pi} g(q)\,J_0(qu)\,q\,dq    (19.4.6) 

where F(u) is the far field pattern, q is a normalized distance in the aperture plane, 
q = π(r/a), and g(q) is the feed’s pattern projected onto the aperture plane as discussed above. 
A convenient parameterization of the feed pattern in terms of the taper C is 

    g(q) = C + (1 - C)\left[1 - \left(\frac{q}{\pi}\right)^2\right]^n    (19.4.7) 


The aperture illuminations corresponding to different values of the parameter n are 
shown in Figure 19.2. The case n = 0 corresponds to a uniform aperture distribution. 

For uniform illumination the far field pattern is given by 

    F(u) = \frac{2\,J_1(\pi u)}{\pi u}    (19.4.8) 

Simple closed-form expressions are available for integer values of n. If the above 
expression F(u) is denoted as F_0(u) (since n = 0), the general form for any integer n is given 
by 

    F_n(u) = \frac{n+1}{Cn+1}\left[C\,F_0(u) + \frac{1-C}{n+1}\,f_n(u)\right]    (19.4.9) 

where 

    f_n(u) = 2^{n+1}\,(n+1)!\,\frac{J_{n+1}(\pi u)}{(\pi u)^{n+1}}    (19.4.10) 


Table 19.1 gives the half-power beamwidth (HPBW), the first sidelobe level and the 
taper efficiency (see Section 19.4.1) for various edge tapers C and shape parameters n. 


From Table 19.1 (see also the discussion in Chapter 3) we find that as the edge-taper 
parameter C decreases, the HPBW increases, the first sidelobe level falls and the 
taper efficiency also decreases. Note that C has to be less than unity since we have 
assumed a radiation pattern which decreases monotonically with increasing angle from 
the symmetry axis (Eqn 19.4.7, Fig 19.2). 
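
The HPBW column of Table 19.1 can be checked numerically from the closed-form pattern F_n(u). The following Python sketch is only an illustration under stated assumptions: the Bessel functions are evaluated from their standard integral representation so that no external library is needed, the half-power point is taken where the field pattern falls to 1/sqrt(2), and the small-angle approximation sin θ ≈ θ is used to turn u into a beamwidth. All function names are hypothetical.

```python
import math


def bessel_j(order, x, steps=400):
    """J_n(x) from (1/pi) * integral_0^pi cos(n*t - x*sin t) dt, midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def f_n(u, n):
    """f_n(u) = 2^(n+1) (n+1)! J_(n+1)(pi u) / (pi u)^(n+1); equals 1 at u = 0."""
    x = math.pi * u
    if x < 1e-8:
        return 1.0
    return 2 ** (n + 1) * math.factorial(n + 1) * bessel_j(n + 1, x) / x ** (n + 1)


def pattern(u, n, C):
    """F_n(u) for edge taper C and shape parameter n (F_0 is f_0)."""
    return (n + 1) / (C * n + 1) * (C * f_n(u, 0) + (1 - C) / (n + 1) * f_n(u, n))


def hpbw_coefficient(n, C, tol=1e-7):
    """Return k such that HPBW is approximately k * lambda / (2a)."""
    target = 1.0 / math.sqrt(2.0)      # half power in terms of field amplitude
    lo, hi = 1e-6, 1.0                 # bracket inside the main lobe
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pattern(mid, n, C) > target:
            lo = mid
        else:
            hi = mid
    u_half = 0.5 * (lo + hi)
    return 2.0 * u_half                # full width, with sin(theta) ~ theta


print(round(hpbw_coefficient(1, 0.316), 2))
```

For n = 1 and C = 0.316 (a -10 dB edge taper) this gives a coefficient close to the 1.14 listed in Table 19.1, and for uniform illumination (n = 0) it gives about 1.03.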
Figure 19.2: The shape of the aperture illumination as given by eqn 19.4.7 for different 
values of the parameter n. 


Table 19.1: Radiation characteristics of circular aperture 

   Edge taper     |               n = 1               |               n = 2 
 T_E (dB)    C    | HPBW (rad.)   Sidelobe     η_t    | HPBW (rad.)   Sidelobe     η_t 
                  |               level (dB)          |               level (dB) 
   -8      0.398  |  1.12 λ/2a     -21.5      0.942   |  1.14 λ/2a     -24.7      0.918 
  -10      0.316  |  1.14 λ/2a     -22.3      0.917   |  1.17 λ/2a     -27.0      0.877 
  -12      0.251  |  1.16 λ/2a     -22.9      0.893   |  1.20 λ/2a     -29.5      0.834 
  -14      0.200  |  1.17 λ/2a     -23.4      0.871   |  1.23 λ/2a     -31.7      0.792 
  -16      0.158  |  1.19 λ/2a     -23.8      0.850   |  1.26 λ/2a     -33.5      0.754 
  -18      0.126  |  1.20 λ/2a     -24.1      0.833   |  1.29 λ/2a     -34.5      0.719 
19.4.1 Aperture Efficiency 

The “aperture efficiency” of an antenna was earlier defined (Sec 19.3) to be the ratio of 
the effective radiating (or collecting) area of an antenna to the physical area of the 
antenna. The aperture efficiency of a feed-and-reflector combination can be decomposed 
into five separate components: (i) the illumination efficiency or “taper efficiency”, η_t, 
(ii) the spillover efficiency, η_s, (iii) the phase efficiency, η_p, (iv) the cross-polar 
efficiency, η_x, and (v) the surface error efficiency, η_r: 

    \eta_a = \eta_t\,\eta_s\,\eta_p\,\eta_x\,\eta_r    (19.4.11) 


The illumination efficiency (see also Chapter 3, where it was called simply “aperture 
efficiency”) is a measure of the nonuniformity of the field across the aperture caused by 
the tapered radiation pattern (refer Figure 19.2). Essentially because the illumination is 
less towards the edges, the effective area being used is less than the geometric area of the 
reflector. It is given by 


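    \eta_t = \frac{1}{A}\,\frac{\left|\int g(r)\,dr\right|^2}{\int |g(r)|^2\,dr}    (19.4.12) 

where g(r) is the aperture field and the integrals extend over the aperture, of geometric 
area A. Note that this has a maximum value of 1 when the 
aperture illumination is uniform, i.e. g(r) = 1. The illumination efficiency can also be 
written in terms of the electric field pattern of the feed E(θ), viz. 

    \eta_t = 2\cot^2\!\left(\frac{\theta_0}{2}\right)\frac{\left|\int_0^{\theta_0} E(\theta)\tan(\theta/2)\,d\theta\right|^2}{\int_0^{\theta_0} |E(\theta)|^2 \sin\theta\,d\theta}    (19.4.13) 

where θ_0 is the angle subtended by the edge of the reflector at the focus (Figure 19.1). 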
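
For the illumination family g = C + (1 - C)[1 - (r/a)^2]^n used in Table 19.1, the taper efficiency can be evaluated in closed form. The Python sketch below assumes that the aperture integrals are taken over the area of a circular aperture (so that the area element is proportional to d(r^2)); the function name is illustrative. With that reading, the η_t column of Table 19.1 is reproduced to three decimals.

```python
def taper_efficiency(C, n):
    """Illumination efficiency of g = C + (1 - C) * (1 - x)**n, with x = (r/a)**2.

    Over a circular aperture the area element is proportional to dx, so
    eta_t = (mean of g)^2 / (mean of g^2), with both means taken over x in [0, 1].
    """
    mean_g = C + (1.0 - C) / (n + 1)
    mean_g2 = C ** 2 + 2.0 * C * (1.0 - C) / (n + 1) + (1.0 - C) ** 2 / (2 * n + 1)
    return mean_g ** 2 / mean_g2


# Compare with the Table 19.1 rows for the -10 dB edge taper (C = 0.316).
for n in (1, 2):
    print(n, round(taper_efficiency(0.316, n), 3))
```

Uniform illumination (C = 1) returns exactly 1, the maximum value noted above.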

When a feed illuminates the reflector, only a proportion of the power from the feed 
will intercept the reflector, the remainder being the spillover power. This loss of power is 
quantified by the spillover efficiency, i.e. 


    \eta_s = \frac{\int_0^{\theta_0} |E(\theta)|^2 \sin\theta\,d\theta}{\int_0^{\pi} |E(\theta)|^2 \sin\theta\,d\theta}    (19.4.14) 


Note that the illumination efficiency and the spillover efficiency are complementary; 
as the edge taper increases, the spillover will decrease (and thus η_s increases), while 
the illumination or taper efficiency η_t decreases (see footnote 1). The tradeoff between 
η_s and η_t has an optimum solution, as indicated by the product η_s η_t in Figure 19.3. 
The maximum of η_s η_t occurs for an edge taper of about -11 dB and has a value of about 
80%. In practice, a value of -10 dB edge taper is frequently quoted as being optimum. 
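
The tradeoff between spillover and illumination can be illustrated numerically. The Python sketch below is not the calculation behind Figure 19.3: it assumes a hypothetical feed pattern E(θ) = cos^q(θ) that radiates only into the forward hemisphere, uses the rim angle of 62.5° quoted for the GMRT dish in Section 19.5, and evaluates the feed-pattern integrals for η_t and η_s with a simple midpoint rule. The exponent q and all function names are assumptions made for illustration.

```python
import math

THETA0 = math.radians(62.5)   # rim angle of the dish as seen from the focus


def integrate(func, a, b, steps=2000):
    """Composite midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def feed_field(theta, q):
    """Hypothetical feed amplitude pattern cos^q(theta), forward hemisphere only."""
    return math.cos(theta) ** q


def spillover_efficiency(q, theta0=THETA0):
    """Fraction of the feed power that is intercepted by the reflector."""
    power = lambda t: feed_field(t, q) ** 2 * math.sin(t)
    # The model feed does not radiate backwards, so the total stops at pi/2.
    return integrate(power, 0.0, theta0) / integrate(power, 0.0, math.pi / 2)


def taper_efficiency(q, theta0=THETA0):
    """Illumination efficiency from the feed pattern, 2 cot^2(theta0/2) |..|^2 / (..)."""
    num = integrate(lambda t: feed_field(t, q) * math.tan(t / 2), 0.0, theta0) ** 2
    den = integrate(lambda t: feed_field(t, q) ** 2 * math.sin(t), 0.0, theta0)
    return 2.0 / math.tan(theta0 / 2) ** 2 * num / den


def best_exponent(q_values):
    """Exponent that maximizes the product of the two efficiencies."""
    return max(q_values, key=lambda q: spillover_efficiency(q) * taper_efficiency(q))


q_grid = [0.5 + 0.1 * i for i in range(26)]
q_best = best_exponent(q_grid)
print(q_best, spillover_efficiency(q_best) * taper_efficiency(q_best))
```

Narrower feed patterns (larger q) raise η_s and lower η_t, so the product has a maximum at an intermediate taper; for this model feed the maximum is a little above 0.8, in line with the value quoted above.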

The surface-error efficiency is independent of the feed’s illumination. It is associated 
with far-field cancellations arising from phase errors in the aperture field caused by errors 
in the reflector’s surface. If δ is the rms error in the surface of the reflector, the 
surface-error efficiency is given by 

    \eta_r = \exp\left[-\left(\frac{4\pi\delta}{\lambda}\right)^2\right]    (19.4.15) 
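
As a numerical illustration of the surface-error efficiency, the following Python sketch evaluates exp[-(4π δ/λ)^2] for an rms surface error of 5 mm (the value expected for the GMRT mesh panels, see Section 19.5) at two wavelengths. The function name is illustrative only.

```python
import math


def surface_error_efficiency(rms_error_m, wavelength_m):
    """Surface-error efficiency exp[-(4 pi delta / lambda)^2]."""
    return math.exp(-(4.0 * math.pi * rms_error_m / wavelength_m) ** 2)


# 5 mm rms error at 21 cm, and at 10 cm (i.e. at a wavelength of 20 times the rms error).
print(round(surface_error_efficiency(0.005, 0.21), 3))
print(round(surface_error_efficiency(0.005, 0.10), 3))
```

The efficiency is about 0.91 at 21 cm but falls to about 0.67 when the wavelength is only 20 times the rms error, which is why the surface accuracy sets the shortest usable wavelength.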

The remaining two efficiencies, the phase efficiency and the cross-polarization efficiency, 
are very close to unity; the former measures the uniformity of the phase across 
the aperture and the latter measures the amount of power lost in the cross-polar radiation 
pattern. For symmetric feed patterns [6], η_x is defined through the co-polar and 
cross-polar patterns, C_p(θ) and X_p(θ): 

    \eta_x = 1 - \frac{\int_0^{\theta_0} |X_p(\theta)|^2 \sin\theta\,d\theta}{\int_0^{\theta_0} \left(|C_p(\theta)|^2 + |X_p(\theta)|^2\right) \sin\theta\,d\theta}    (19.4.16) 

where 

    C_p(\theta) = \tfrac{1}{2}\left[E(\theta) + H(\theta)\right] 
    X_p(\theta) = \tfrac{1}{2}\left[E(\theta) - H(\theta)\right]    (19.4.17) 

It can be seen that if one can design an antenna having identical E(θ) and H(θ) patterns, the 
cross-polar pattern will vanish. Taking the cue from this, the feed for the antenna can be 
designed with the goal of matching the E and H patterns at least up to the subtended angle 
of the dish edge, θ_0. 

Footnote 1: Recall also from Chapter 3 that as the illumination is made more and more 
uniform the sidelobe level increases. 

With this background we now proceed to take a detailed look at the GMRT antennas. 


19.5 Design Specifications for the GMRT Antennas 

The f/D ratio for the GMRT antennas was fixed at the value 0.412 based both on structural 
design issues and on preliminary studies of the radiation patterns of various feeds. Since the 
antennas are to work at meter wavelengths, prime focus feeds were preferred. Cassegrain 
feeds at meter wavelengths would result in impractically large secondary mirrors (the 
mirror has to be several λ across) and concomitantly large aperture blockage. 

Six bands of frequencies had been identified [1] for the GMRT observations. It was 
deemed essential to be able to change the observing frequency rapidly, and consequently 
the feeds had to be mounted on a rotating turret placed at the prime focus. If one were 
to mount all the six feeds on a rotating hexagon at the focus, the adjacent feeds would be 
separated by 60°. If one wants to illuminate the entire aperture, then one has to have a 
feed pattern that extends at least up to the subtended angle of the parabola edge, which 
is θ_0 = 62.5° (note that cot(θ_0/2) = 4 f/D, Figure 19.1). Hence this arrangement of feeds 
would cause one feed to “see” the feeds on the adjacent faces. It was decided therefore 
to mount the feeds on orthogonal faces of a rotating cube. Since one needs six frequency 
bands, this leads to the constraint that at least two faces of the turret should support 
dual frequency capability. For astronomical reasons also, dual frequency capability was 
highly desirable. 
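
The rim angle follows directly from the f/D ratio through cot(θ_0/2) = 4 f/D. A short Python check is given below; the function name is illustrative.

```python
import math


def rim_angle_deg(f_over_d):
    """Angle subtended by the dish edge at the focus, from cot(theta0/2) = 4 f/D."""
    return math.degrees(2.0 * math.atan(1.0 / (4.0 * f_over_d)))


theta0 = rim_angle_deg(0.412)
print(round(theta0, 1))
# On a hexagonal turret neighbouring feeds sit 60 degrees apart, inside the
# illuminated cone; on a cube they are 90 degrees apart, outside it.
print(theta0 > 60.0, theta0 < 90.0)
```

For f/D = 0.412 the rim angle is 62.5°, which exceeds the 60° separation of a hexagonal turret but not the 90° separation between the faces of a cube.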

One specific aspect of the GMRT design is the use of mesh panels to make the reflector 
surface [1]. Since the mesh is not perfectly reflective, transmission losses through the 
mesh have to be taken into account. Further, the expected surface errors of the mesh 
panels were ~ 5 mm. This implies that the maximum usable frequency is (see Section 19.2) 
~ 3000 MHz, independent of the transmission losses of the mesh. (Incidentally, since the 
mean spacing of the feed-support legs is L = 23.6 m, the lowest usable frequency is around 6 
MHz.) 
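
The two frequency limits quoted above follow from the rules of thumb of Section 19.2. The Python sketch below takes the shortest usable wavelength to be 20 times the rms surface error (the reading that reproduces the ~ 3000 MHz figure) and the longest usable wavelength to be twice the feed-leg spacing; the function names are illustrative.

```python
C_LIGHT = 299_792_458.0   # speed of light in m/s


def max_usable_frequency_mhz(surface_rms_m):
    """Upper frequency limit, taking lambda_min = 20 * (rms surface error)."""
    lambda_min = 20.0 * surface_rms_m
    return C_LIGHT / lambda_min / 1.0e6


def min_usable_frequency_mhz(leg_spacing_m):
    """Lower frequency limit, taking lambda_max = 2 * (mean feed-leg spacing)."""
    lambda_max = 2.0 * leg_spacing_m
    return C_LIGHT / lambda_max / 1.0e6


print(round(max_usable_frequency_mhz(0.005)))      # 5 mm rms surface error
print(round(min_usable_frequency_mhz(23.6), 2))    # L = 23.6 m
```

With a 5 mm surface error the upper limit is close to 3000 MHz, and with L = 23.6 m the lower limit is a little over 6 MHz.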

Several analytical methods exist in the literature to compute the transmission loss through 
a mesh as a function of the cell size, the wire diameter and the wavelength of the incident 
radiation. The one chosen for our application has good experimental support [2,3]. At 
the GMRT, the mesh size is 10 x 10 mm for the central 1/3 of the dish, 15 x 15 mm for the 
middle 1/3 of the dish and 20 x 20 mm for the outer 1/3 of the dish. The wire diameter 
is 0.55 mm. The transmission losses at two fiducial wavelengths for these various mesh 
sizes are given in Table 19.2. 

Mesh size     λ = 21 cm     λ = 50 cm 
10 mm         -15.8 dB      -23.3 dB 
15 mm         -11.4 dB      -18.4 dB 
20 mm          -8.1 dB      -14.6 dB 

Table 19.2: Transmission losses through the GMRT wire mesh 
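
To use the entries of Table 19.2 in an efficiency calculation they have to be converted from dB into a fraction of the incident power. The Python sketch below assumes that the tabulated values are power ratios (10 log10); the function name is illustrative.

```python
def transmitted_fraction(loss_db):
    """Fraction of the incident power leaking through the mesh, for a loss in dB."""
    return 10.0 ** (loss_db / 10.0)


# Table 19.2 entries at a wavelength of 21 cm for the 10, 15 and 20 mm mesh.
for mesh_mm, loss_db in ((10, -15.8), (15, -11.4), (20, -8.1)):
    tau = transmitted_fraction(loss_db)
    print(mesh_mm, round(tau, 4), round(1.0 - tau, 4))
```

At 21 cm the finest mesh leaks under 3% of the incident power while the coarsest leaks about 15%, so the outer part of the dish reflects noticeably less efficiently at this wavelength.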

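Each section of the dish not only has a separate mesh size but also a separate surface 
rms error. If we call these rms surface errors σ_1, σ_2, σ_3 and the respective transmission 
losses (at some given wavelength) τ_1, τ_2, τ_3, then the surface rms efficiency given by 
Eqn 19.4.15 has to be altered to a weighted rms efficiency: 

    \eta_r = \frac{A_1 + A_2 + A_3}{\int_0^{\theta_0} |E(\theta)|^2 \sin\theta\,d\theta} 

where 

    A_1 = \exp\left[-\left(\frac{4\pi\sigma_1}{\lambda}\right)^2\right] \int_0^{\theta_2} |E(\theta)|^2 \sin\theta\,d\theta    (19.5.18) 

    A_2 = \exp\left[-\left(\frac{4\pi\sigma_2}{\lambda}\right)^2\right] \int_{\theta_2}^{\theta_1} |E(\theta)|^2 \sin\theta\,d\theta    (19.5.19) 

    A_3 = \exp\left[-\left(\frac{4\pi\sigma_3}{\lambda}\right)^2\right] \int_{\theta_1}^{\theta_0} |E(\theta)|^2 \sin\theta\,d\theta    (19.5.20) 

and θ_2, θ_1 are the angles subtended by the first and second mesh-transition 
zones, as illustrated in Figure 19.4. 

The transmission loss gives a corresponding mesh-leakage or mesh-transmission efficiency, η_mt, 
which is given by 

    \eta_{mt} = \frac{B_1 + B_2 + B_3}{\int_0^{\theta_0} |E(\theta)|^2 \sin\theta\,d\theta}    (19.5.21) 

where 

    B_1 = (1 - \tau_1) \int_0^{\theta_2} |E(\theta)|^2 \sin\theta\,d\theta    (19.5.22) 

    B_2 = (1 - \tau_2) \int_{\theta_2}^{\theta_1} |E(\theta)|^2 \sin\theta\,d\theta    (19.5.23) 

    B_3 = (1 - \tau_3) \int_{\theta_1}^{\theta_0} |E(\theta)|^2 \sin\theta\,d\theta    (19.5.24) 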


Efficiencies computed for the different GMRT feeds (using their measured patterns 
as the input) are given in Table 19.4. 
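
The zone-weighted efficiencies can be sketched numerically as follows. This Python example is an illustration only and does not reproduce the values of Table 19.4: it assumes a hypothetical cos^q feed pattern instead of the measured patterns, places the zone boundaries at one third and two thirds of the dish radius (the text does not say how the thirds are measured), and uses the same hypothetical 5 mm rms error in all three zones. The mesh leakage values are the 21 cm entries of Table 19.2, read as power ratios.

```python
import math


def integrate(func, a, b, steps=2000):
    """Composite midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def zone_powers(edges, q=1.0):
    """Feed power falling in each zone for a hypothetical cos^q(theta) feed."""
    power = lambda t: math.cos(t) ** (2.0 * q) * math.sin(t)
    return [integrate(power, lo, hi) for lo, hi in zip(edges[:-1], edges[1:])]


def weighted_surface_efficiency(sigmas_m, wavelength_m, powers):
    """Power-weighted mean of exp[-(4 pi sigma_i / lambda)^2] over the zones."""
    weights = [math.exp(-(4.0 * math.pi * s / wavelength_m) ** 2) for s in sigmas_m]
    return sum(w * p for w, p in zip(weights, powers)) / sum(powers)


def mesh_transmission_efficiency(taus, powers):
    """Power-weighted mean of (1 - tau_i) over the zones."""
    return sum((1.0 - t) * p for t, p in zip(taus, powers)) / sum(powers)


theta0 = math.radians(62.5)
# Hypothetical zone boundaries at 1/3 and 2/3 of the dish radius; the aperture
# radius is proportional to tan(theta/2) for a paraboloid.
t_edge = math.tan(theta0 / 2.0)
edges = [0.0] + [2.0 * math.atan(frac * t_edge) for frac in (1.0 / 3.0, 2.0 / 3.0)] + [theta0]
powers = zone_powers(edges)

taus_21cm = [10.0 ** (db / 10.0) for db in (-15.8, -11.4, -8.1)]
sigmas = [0.005, 0.005, 0.005]   # hypothetical per-zone rms errors in metres

print(round(weighted_surface_efficiency(sigmas, 0.21, powers), 3))
print(round(mesh_transmission_efficiency(taus_21cm, powers), 3))
```

Because each zone is weighted by the feed power falling on it, the coarse outer mesh matters less when the feed illumination is strongly tapered towards the rim.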
Figure 19.4: Schematic of the subdivision of the GMRT antenna surface into 3 zones. 
The mesh size as well as the rms surface error is different in the different zones. 

19.5.1 Secondary Patterns 

The antenna pattern at 327 MHz as computed using geometric optics is shown in 
Figure 19.5. A more rigorous analytical model (the Uniform Theory of Diffraction [7]) gives the 
pattern shown in Figure 19.6. 

The two models differ markedly in their side-lobe structure, while the primary beams 
have near-identical shapes and the HPBW values agree to the second decimal place. 
The computed HPBW also agrees to within measurement 
errors with the observed HPBW of the actual GMRT antennas. 


19.6 GMRT Feeds 

19.6.1 Feed Placement 

Recall that from the constraints outlined in Sec 19.5 it had been decided that the feed 
turret should be cubical in shape. Fig 19.7 shows the placement of feeds on the turret. 
The phase centers of all the feeds are coincident with the paraboloid focus. The space 
between the turret and the feed is utilized for mounting the front-end electronics. There 
are six bands altogether: 1000-1450 MHz (see footnote 2), 610 MHz, 327 MHz, 233 MHz, 
150 MHz and 50 MHz. The 50 MHz feed (see footnote 3) is affixed onto the feed support 
legs and not onto the turret. As such it is in focus at all times. The 610 MHz and 233 MHz 
feeds are mounted on the same turret face. 

The design and performance of each type of feed are briefly outlined in the following 
sections. More information can be found in [8]. 


Footnote 2: Note that some of the antennas have feeds that extend to 1750 MHz. 

Footnote 3: The 50 MHz feed is not yet operational. 
Figure 19.5: Computed pattern (using geometric optics) of a GMRT antenna at 327 MHz. 
(Plot title: Secondary Patterns at 327 MHz; horizontal axis: angle in degrees.) 


19.6.2 150 MHz Feed 

This feed employs four dipoles in a “boxing ring” configuration, placed above a plane 
reflector. The unique feature of the dipole is that it is wide-band, i.e. it has an octave 
bandwidth. It is a folded dipole with each arm being a “thick” dipole. A dipole is called 
“thin” when its diameter d < 0.05 λ. For such dipoles a sinusoidal current distribution can 
be assumed for the computation of input impedance and related radiation parameters. 

Thin dipoles have narrowband radiation characteristics. One method by which the 
acceptable operational bandwidth can be increased is to decrease the l/d ratio. For 
example, an antenna with l/d ≈ 5000 has an acceptable bandwidth of about 3%, while an 
antenna of the same length but with l/d ≈ 260 has a bandwidth of about 30%. By folding 
the dipole, one gets a four-fold increase in input impedance compared to a simple dipole. 
The 150 MHz feed also has a transmission line impedance transformer coupled to the 
excitation point [9]. 

Traditionally crossed-dipoles are used to give sensitivity to both polarizations. However, 
since a crossed-dipole configuration in this design would be extremely cumbersome, 
a “boxing ring” design was instead chosen. Here one pair of dipoles at λ/2 spacing 
provides sensitivity to one linear polarization. Another pair orthogonally oriented with respect 
to the first pair gives sensitivity to the orthogonal polarization. The overall dimensions of 
the feed are: 

• Folded dipole length : 0.39 λ 

• Dipole height above reflector : 0.29 λ 

• Reflector (diagonal of octagon) : 1.2 λ 

Figure 19.7: Schematic diagram showing the arrangement of the different feeds on the 
feed turret. 


The dipoles have an l/d ratio of 6.48, and the phase center was determined to be at a 
height of 100 mm above the reflector. The feed’s impedance bandwidth can be seen on 
the VSWR plot of Figure 19.8. 

The usable bandwidth for a feed is given approximately by the range for which SWR 
< 2.0. By this criterion, the frequency coverage of the 150 MHz feed is from 117 MHz to 
247 MHz, i.e. a bandwidth of 130 MHz, or 86% bandwidth. The radiation pattern gives 
an edge taper, T_E = -9 dB. 
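
The percentage bandwidth quoted above is the usable range divided by the nominal band frequency. A short Python check for the 150 MHz feed is given below; the function name is illustrative.

```python
def fractional_bandwidth_percent(f_low_mhz, f_high_mhz, f_nominal_mhz):
    """Usable bandwidth as a percentage of the nominal band frequency."""
    return 100.0 * (f_high_mhz - f_low_mhz) / f_nominal_mhz


# 150 MHz feed: SWR < 2.0 from 117 MHz to 247 MHz.
print(round(fractional_bandwidth_percent(117.0, 247.0, 150.0), 1))
print(round(247.0 / 117.0, 2))   # ratio of band edges; above 2 means more than an octave
```

The band edges differ by a factor of slightly more than two, consistent with the description of the dipole as having an octave bandwidth.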
Figure 19.8: VSWR for the 150 MHz feed. (Plot title: Folded Thick Dipole - 150 MHz.) 


One undesirable feature of this feed is the high value of cross-polarization, as compared 
to that at other frequencies (see Figure 19.9 and footnote 4). The cross-polar peak at 150 MHz 
is -17 dB and the on-axis cross polarization is also at about the same level. 

One pair of outputs, from the dipoles which are parallel to each other, is connected 
to a power combiner, whose output goes to one port of the quadrature hybrid (which 
adds two linearly polarized signals to yield one circularly polarized signal). Similarly the 
orthogonal pair of dipoles is connected to the other port of the hybrid. Both the power 
combiners and the quadrature hybrid are mounted inside one of the front-end chassis, 
placed behind the feed. 


19.6.3 327 MHz Feed 

Generally a dipole has a broader H-plane pattern than E-plane pattern (the E-plane being the 
plane containing the dipole). Recall from the discussion in Section 19.4.1 that for good 
cross-polarization properties it is essential to have matched E- and H-plane patterns. 
An elegant method for achieving this pattern matching was given by P. S. Kildal [10], and 
involves placing a beam forming ring (BFR) above the dipole (see footnote 5). The conducting 
ring is placed above the dipole in a plane parallel to the reflector and is supported by dielectric 
rods. The beam forming ring compresses the H-plane pattern while it has no significant 
effect on the E-plane. 

Footnote 4: Note that the cross polar pattern was measured using the standard technique 
outlined in [4: pp. 177-79]. The cross-polar levels are measured with respect to a co-polar 
maximum of 0 dB. 

Footnote 5: This design has been christened the ‘Kildal Feed’ in the local jargon. 
Figure 19.10: The VSWR as a function of frequency for the 327 MHz feed. 
(Plot title: Kildal Feed - 327 MHz; horizontal axis: frequency in MHz.) 


The optimum dimensions of the dipole, BFR and reflector were arrived at by careful 
measurements done on a frequency-scaled version (i.e. at 610 MHz) and follow-up 
measurements on a prototype 327 MHz model. The values arrived at were: 

• Reflector diameter : 2.2 λ 

• Height of dipole above reflector : 0.26 λ 

• BFR diameter : 1.22 λ 

• BFR height above reflector : 0.51 λ 

The measured phase center is at 26 mm above the reflector for both the E- and H-planes. 
Crossed dipoles are employed for dual polarization. The 327 MHz feed actually deviates 
slightly from Kildal’s original design: there are sleeves over the dipoles. These sleeves 
increase the bandwidth of the feed [5]. The VSWR plot for the 327 MHz feed is given in 
Figure 19.10. 

For SWR < 2.0, the bandwidth is 138 MHz (286 to 424 MHz). The measured antenna 
pattern is given in Fig 19.11. The edge taper, T_E, is -12.2 dB. Fig 19.9 shows the 
cross-polar pattern. It is seen that a cross-polar maximum of -27.5 dB (mean value) has been 
achieved. 

The linearly polarized outputs of the dipoles are combined in a quadrature hybrid in one of 
the front-end chassis to produce two circularly polarized signals (left and right), which 
go on to the amplifying and signal conditioning circuits of the front-end electronics. 
Figure 19.11: The measured antenna pattern at 327 MHz. (Plot title: Kildal Feed - 327 MHz.) 

19.6.4 Dual-Frequency Coaxial Waveguide Feed 

The 610 MHz and 233 MHz feeds are dual frequency coaxial feeds. The single most 
attractive feature of a coaxial waveguide feed is its multi-frequency launching capability. 
Simultaneous transmission or reception of well separated frequencies is possible. Coaxial 
feeds have been used as on-board satellite antennas to provide coverage at three separate 
frequency bands [11]. Coaxial feeds have also been used at the WSRT (operated by NFRA, 
The Netherlands). The prime focus feed system at the WSRT has two separate 
multi-frequency coaxial waveguides, covering 327 MHz and 2300 MHz in one and 610 MHz and 5000 
MHz in the other [12],[13]. 

The design of the GMRT 610 MHz/233 MHz waveguide feeds is based on an exhaustive 
theoretical analysis of the design of coaxial waveguide feeds [14], [15]. A constraint in such 
multi-frequency designs is that adjacent frequency bands should not lie within an 
octave of each other. Thus at the GMRT either the 150 MHz or the 233 MHz band could have 
been combined with 610 MHz. However the former choice was rejected since it resulted in 
unwieldy dimensions of the feed. 

The fundamental mode of propagation in coaxial structures is TEM, hence the radiated 
field component along the axis is zero everywhere. For a feed this is clearly a most 
undesirable characteristic, so propagation by an alternate mode (single or multiple) is 
essential. Coaxial waveguides must then be forced to radiate in the TE11 mode. This can be 
achieved simply by exciting the probes in phase opposition (see footnote 6). 

In the dual frequency construction the outer conductor of the 610 MHz feed serves as the 
inner one for the 233 MHz feed. Quarter wavelength chokes are provided in both the frequency 
parts to cut down the surface currents on the outer conductor and thereby ensure pattern 
symmetry. The waveguide feeds have two pairs of probes. One pair supports a given plane 
polarization while the orthogonal pair supports the orthogonal polarization. Similar to 
the dipole feed discussed in the previous section, a quadrature hybrid at the back-end 
of the coaxial feed is used to convert the linear polarization to circular polarization. The 
rear half of the 610 MHz feed, separated by a partition disc, is utilized to accommodate 
the baluns, quadrature hybrids and low-noise amplifiers of 610 MHz and the baluns of 
233 MHz. The overall dimensions of the feed are given in Table 19.3. 

Footnote 6: Low-loss baluns are essential in such designs. 

Dimensions                  610 MHz Coaxial     233 MHz Coaxial 
Aperture diameter           0.9 λ               0.85 λ 
Waveguide cavity length     0.95 λ              0.73 λ 

Table 19.3: Dimensions of the 610/233 MHz coaxial feed. 

Figure 19.12: The VSWR as a function of frequency for the 610 MHz feed. 
(Plot title: Coaxial Wg. Feed - 610 MHz.) 

The phase center is not at the aperture plane, but at a point 60 mm in front of the 
aperture. A similar displacement of the phase center is also seen in the WSRT coaxial 
feeds [13]. Fig 19.12 shows the VSWR plot for an optimized probe geometry at 610 MHz. 
For SWR < 2.0, the band goes from 580 MHz to 707 MHz, i.e. a total bandwidth of 127 
MHz. The feed patterns measured at 610 MHz are shown in Fig 19.13; the edge taper is 
-9.8 dB. The cross-polar maximum is -22.8 dB. 

Fig 19.14 shows the VSWR plot of the 233 MHz part of the coaxial feed. 

For SWR < 2.0, the bandwidth is 12 MHz, i.e. this feed is rather narrow-band as compared 
to all other frequency bands. The effect of the inter-coupling of radiated power between 
the two frequencies of the coaxial feed on the radiation patterns has been studied. The 
main lobe does not show any significant change due to the presence of the other coaxial 
waveguide part. 

Figure 19.13: The feed pattern of the 610 MHz feed. (Plot title: Coaxial Wg. Feed - 610 MHz.) 


19.7 1000-1450 MHz Feed 

This feed was designed and constructed by the Millimeter Wave Laboratory of the 
Raman Research Institute. It is of the corrugated horn type, known for its high aperture 
efficiency and very low cross-polarization levels. In any horn, the antenna pattern is 
severely affected by the diffraction from the edges, which can lead to undesirable 
radiation not only in the back lobes but also in the main lobe. By making grooves on the 
walls of a horn, the spurious diffractions are eliminated. Such horns are called 
“corrugated horns” [4]. Our feed at 1420 MHz has fins instead of grooves, since the whole 
assembly is made out of brass sheets. The flare angle of the horn is 120°. The dimensions 
of the feed are: 


• Aperture diameter : 3.65 λ 

• Horn length : 4.48 λ 

The phase center has been found to be at the apex of the cone, at a depth of 200 mm 
from the aperture plane. This feed has an impressive bandwidth: 580 MHz, from 
1000 MHz to 1580 MHz, as can be seen from Fig 19.15. 
Figure 19.14: The VSWR as a function of frequency for the 233 MHz feed. 
(Plot title: Coaxial Wg. Feed.) 


Radiation patterns, including the cross-polar pattern, are shown in Fig 19.16. 

The edge taper is -19 dB and the cross-polar peak is -24 dB. The front-end electronics 
are housed in a rectangular box on the back side of the horn, forming one integral unit. 
The entire band is divided into 4 subbands, each 140 MHz wide and centered on 1390, 
1280, 1170 and 1060 MHz. There is also a bypass mode in which the entire bandwidth 
is available. 


19.8 GMRT Antenna Efficiencies 


The efficiency relations shown in Section 19.5 do not consider the effect of aperture 
blockage by the feeds and the feed-support frames (“quadripod legs” in GMRT parlance). 
Simple models based on geometrical optics exist for such computations [16]; these are 
used along with GMRT-specific efficiency relations to produce the following table. 
Limitations of this model are highlighted in [17]. 

Some of the loss terms can be expressed as equivalent noise temperatures (see Chapter 3). 
The spillover temperature is given by (see also Eqn 19.4.14) 


    T_{sp} = T_g \frac{\int_{\theta_0}^{\pi/2} |E(\theta)|^2 \sin(\theta) \, d\theta}{\int_0^{\pi} |E(\theta)|^2 \sin(\theta) \, d\theta}        (19.8.25) 


where T_g is the ground temperature. Considering the reflectance of soil at microwave 
frequencies, it is taken to be 259 K. 

Figure 19.15: The VSWR as a function of frequency for the 1420 MHz feed. 

Similarly, the mesh-leakage temperature T_{ml} and the temperature T_{sc} due to radiation 
scattered by the feed-support frames can also be expressed in terms of T_g. The overall 
system temperature (see Chapter 3) is the sum of all these, the receiver noise temperature 
T_r and the sky temperature T_{sky}, which is assumed to be 


    T_{sky} = 3.0 + 20 (408/f)^{2.75}        (19.8.26) 

where f is the frequency of the received signal (in MHz). Hence 

    T_{sys} = T_r + T_{sky} + T_{sp} + T_{ml} + T_{sc}        (19.8.27) 

Finally, the figure of merit of any radio antenna is the ratio of gain to system temperature, 
G/T_{sys}, where the gain is expressed as 

    G = \frac{S A_p \eta_a}{2k}        (19.8.28) 

where S is the flux density in units of Jansky, A_p is the physical area of the parabolic dish 
and \eta_a is the overall aperture efficiency. For a 1 Jy source in the beam of the antenna, 
and with the value of Boltzmann’s constant k included in the above relation 
(10^{-26} / (2 \times 1.38 \times 10^{-23}) \approx 1/2760, with A_p in m^2), 

    G = \frac{\eta_a A_p}{2760}        (19.8.29) 

Hence, the ratio G/T_{sys} is expressed in units of Jy^{-1}. 
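
As a rough numerical check of Eqns 19.8.26, 19.8.28 and 19.8.29, the sketch below evaluates the sky temperature and the ratio G/T_sys in Python. It assumes a 45 m dish diameter (the figure quoted in the title of reference [7]) and takes the aperture efficiency and system temperature as inputs; the function names are illustrative only and are not part of any GMRT software.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K
JANSKY = 1.0e-26            # W m^-2 Hz^-1
DISH_DIAMETER_M = 45.0      # assumed dish diameter (see reference [7])


def sky_temperature(freq_mhz):
    """Sky temperature in K from Eqn 19.8.26 (frequency in MHz)."""
    return 3.0 + 20.0 * (408.0 / freq_mhz) ** 2.75


def gain_k_per_jy(aperture_eff, diameter_m=DISH_DIAMETER_M):
    """Antenna gain in K/Jy from Eqn 19.8.28 for a 1 Jy source."""
    physical_area = math.pi * (diameter_m / 2.0) ** 2
    return JANSKY * aperture_eff * physical_area / (2.0 * K_BOLTZMANN)


def g_over_tsys(aperture_eff, tsys_k):
    """Figure of merit G/T_sys in units of 1/Jy."""
    return gain_k_per_jy(aperture_eff) / tsys_k


if __name__ == "__main__":
    # 150 MHz entries of Table 19.4: aperture efficiency 0.652, T_sys 428 K
    print("T_sky(150 MHz) [K]:", round(sky_temperature(150.0), 1))
    print("G/T_sys x 1e3 [1/Jy]:", round(1e3 * g_over_tsys(0.652, 428.0), 3))
```

With the 150 MHz entries of Table 19.4 this gives a G/T_sys close to the tabulated 0.877 x 10^{-3} Jy^{-1}.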

Figure 19.16: Radiation Pattern of the 1420 MHz feed. 

A summary of the relevant parameters for the GMRT antennas is given in Table 19.4. 
These have been computed based on the following assumptions. 

1. T_r = 100 K for the 150, 233 and 327 MHz bands; 50 K for 610 MHz and 35 K for the 
1000 to 1400 MHz bands. 

2. The surface rms values \sigma_1, \sigma_2 and \sigma_3 are 8.0, 9.0 and 14.0 mm respectively. 

The agreement between the observed HPBW, gain and system temperature and the computed 
values is in general quite good. 

Frequency (MHz)        150       233       327       610       1000      1200      1400 
Taper Eff.             0.689     0.823     0.715     0.775     0.566     0.533     0.592 
Spill. Eff.            0.952     0.799     0.944     0.835     0.967     0.971     0.971 
Mesh Eff.              0.999     0.999     0.998     0.991     0.943     0.941     0.94 
RMS Eff.               0.997     0.992     0.986     0.948     0.88      0.835     0.78 
Aper. Eff.             0.652     0.651     0.664     0.608     0.452     0.405     0.422 
T_{sys} (K)            428       229       152       92        65        77        62 
G/T_{sys} x 10^{-3}    0.877     1.64      2.53      3.81      4.04      3.02      3.17 
HPBW                   2°52'39"  1°51'06"  1°21'15"  0°42'48"            0°19'26" 

Table 19.4: Calculated aperture efficiencies and system temperatures for the GMRT antennas. 


6. Kildal, P-S., “Factorization of the Feed Efficiency of Paraboloids and Cassegrain 
Antennas”, IEEE Trans. on Ant. & Propg., Vol. AP-33, No. 8, (1985). 

7. Krishnan, T., “Analysis of TIFR / GMRT 45 m. dish performance at 327 MHz.”, 
HAE/HSS Report 010/92, Sept. 1992. 

8. Sankar, G., Swarup, G., Ananthakrishnan, S., Sankararaman, M.R., Sureshkumar, S. 
and Izhak, S.M., “Prime focus feeds for GMRT Antenna”, IAU - 6th Asian Pacific 
Regional Meeting on Astronomy, IUCAA, Pune, Aug. 1993. 

9. Guillou, L., Daniel, J-P., Terret, C., Madani, A., “Rayonnement d’un Doublet Replie 
Epais”, Annales des Telecommunications, tome 30, nr 1-2. 

10. Kildal, P-S., and Skyttemyr, S.A., “Dipole-Disk Antenna with Beam-Forming Ring”, 
IEEE Trans. on Ant. & Propg., Vol. AP-30, No. 4, (1982). 

11. Livingston, M.L., “Multi-frequency Coaxial Cavity Apex Feeds”, Microwave Journal, 
(Oct. 1979) pp. 51-54. 

12. Van Ardanne, A., Bregman, J.D., Sondaar, L.H., Knoben, M.H.M., “A Compact 
Dual Polarized Coaxial Feed at 327 MHz.”, Electronics Letters, Vol. 20, No. 18, (1984) 
pp. 723-724. 

13. Jeuken, M.E.J., Knoben, M.H.M., and Wellington, K.J., “A Dual Frequency, Dual 
Polarized Feed for Radioastronomical Applications”, Rechnernetze und 
Nachrichtenverkehrstheorie, NTZ, Heft 8, (1972) pp. 374-376. 

14. Shafai, L. and Kishk, A.A., “Coaxial Waveguides as Primary Feeds for Reflector 
Antennas and comparison with Circular Waveguides”, AEU, Band 39, Heft 1, (1985) 
pp. 8-14. 

15. Sankar, G. and Praveenkumar, A., “Dual Frequency Coaxial Waveguide Feed - Design 
calculations”, Int. Tech. Report, AG-02/90, GMRT-TIFR, Pune. 

16. Fisher, J.R., “Prime-focus Efficiency, Blockage, Spillover and Scattering Calculations 
on the HP 9825A Calculator”, EDIN Report 174, NRAO, Nov. 1976. 

17. Chengalur, J.N., “Aperture Efficiency Calculations for GMRT Dishes”, NCRA-TIFR 
Int. Tech. Report, Dec. 1993. 
















Chapter 20 

The GMRT Servo System 


V. Hotkar 

20.1 Introduction 

The GMRT servo system is a dual drive position feed-back control system. It can track a 
source in the sky with an rms accuracy of ~ 0.5 arcmin. To realise such a system in practice, 
expertise from various engineering disciplines is put to work. In order to understand 
such a system, one has to become familiar with the theory of feedback control systems 
as well as its application to position control. This chapter discusses these issues. The 
material is presented in a simplified form and an effort has been made to use, wherever 
possible, graphical explanations instead of a mathematical treatment. 


20.2 Objectives of the GMRT Servo System 

The servo systems used for position control of radio telescopes must meet the following 
objectives. 

1. Ability to point anywhere in the sky. 

2. High pointing & tracking accuracy. 

3. Ability to accelerate rapidly in the direction of the source. 

4. Ability to be manoeuvred remotely. 

The first requirement is met by using a two-axis mount for the antenna. For large 
antennas like those used in the GMRT (i.e. with weight in excess of 80 tonnes) an 
alt-azimuth mount is preferred. In such a mount the antenna can be moved about two axes, 
viz. azimuth and elevation. The azimuth axis movement is parallel to the horizon, while 
the elevation axis movement is normal to the horizon. Alt-az mounts are mechanically 
simple, yet very stable. 

Radio telescope antennas are required to point to within +/- 10% of the HPBW at any given 
wavelength of operation of the antenna. This means that the pointing accuracy of the 
antenna should be fairly good. The following issues are of concern when trying to achieve 
high accuracy pointing or tracking: 

1. Structural deformation due to gravity. 

2. Structural vibrations/deformations due to wind forces. 

3. Servo positioning error. 
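
To give a feel for the numbers, the sketch below reads the pointing requirement as one tenth of the HPBW (an assumption about the intended figure) and evaluates it for some of the beam widths listed in Table 19.4, next to the 1 arcmin tracking and pointing accuracy quoted in Table 20.1. The helper names are hypothetical.

```python
def arcmin(degrees, minutes, seconds):
    """Convert an angle given as deg, arcmin, arcsec into arcminutes."""
    return degrees * 60.0 + minutes + seconds / 60.0


# Computed HPBW values taken from Table 19.4 (the last one is the L-band entry)
HPBW_ARCMIN = {
    "150 MHz": arcmin(2, 52, 39),
    "233 MHz": arcmin(1, 51, 6),
    "610 MHz": arcmin(0, 42, 48),
    "L band": arcmin(0, 19, 26),
}

SERVO_ACCURACY_ARCMIN = 1.0  # tracking & pointing accuracy from Table 20.1


def pointing_tolerance(hpbw_arcmin, fraction=0.1):
    """Allowed pointing error, assumed here to be a fixed fraction of the HPBW."""
    return fraction * hpbw_arcmin


if __name__ == "__main__":
    for band, hpbw in HPBW_ARCMIN.items():
        tol = pointing_tolerance(hpbw)
        print(f"{band}: tolerance {tol:.2f} arcmin, "
              f"margin over servo spec {tol / SERVO_ACCURACY_ARCMIN:.1f}x")
```

Under this reading the tightest requirement is at the highest frequency, where the tolerance is about 2 arcmin.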

Note that not only can the reflecting surface of the antenna be affected by gravity, 
the feed support legs too could deform, leading to a displacement of the feed from the 
focus of the antenna. The GMRT antennas are built using a novel technique (nicknamed 
“SMART”) involving a stainless steel mesh which is attached to rope trusses by wires 
which are tensioned appropriately in order to achieve the desired parabolic reflecting 
surface. This results in a dramatic reduction in the gravitational and wind loading on the 
structure, as well as in the total weight of the dish. 


20.3 The GMRT Servo System Specifications 

A summary of the GMRT Servo Specifications is given in Table 20.1. 


Table 20.1: GMRT Servo system summary 

Dish mount                      Altitude-Azimuth mount. 
Drive                           Dual drive in counter torquing mode. 
Dish movement                   Azimuth +270 to -270 deg. 
                                Elevation 15 deg to 110 deg. 
Dish slewing speed              Azimuth 30 deg/min. 
                                Elevation 20 deg/min. 
Minimum Tracking speed          Azimuth 5 arcmin/min. 
                                Elevation 5 arcmin/min. 
Maximum Tracking speed          Azimuth 150 arcmin/min. 
                                Elevation 15 arcmin/min. 
Tracking & pointing accuracy    1 arcmin for wind speed <20 kmph. 
Gear reduction ratio            Azimuth 18963. 
                                Elevation 25162. 
Antenna acceleration            Full speed in > 3 sec for both axes. 
Design Wind speed               40 kmph Operational. 
                                80 kmph Parking. 
                                133 kmph survival. 
System operating voltage        415 VAC, 3 Phase, 50 Hz. 
Antenna parking                 Antenna parking using 96 V DC battery. 


20.4 Control System Description 

The GMRT servo system is a closed loop position feed-back control system, designed for 
tracking & positioning of the GMRT radio telescopes. The use of a dual drive with 
counter-torque eliminates the non-linearity due to the back-lash associated with the gearbox. 

20.4.1 Closed Loop Control Systems 

All automatic control systems use negative feedback for controlling a physical parameter 
like position, velocity, torque etc. The parameter which has to be controlled is sensed by a 
suitable transducer and fed back to the input, for comparison with the reference value 
(see Figure 20.1). This subtraction of the sampled output signal from the reference input 
is called negative feedback. The difference signal, called the “error”, is then amplified to 
drive the system (referred to as actuation) in such a manner that the output approaches 
the set reference value. In other words the system is designed to minimize the error 
signal. 

All practical loads have inertia and spring constants due to which there is a delay 
in actuation. Hence, even though a system may be designed for negative feedback, due to 
inherent time lags, the feedback may turn into positive feedback at certain frequencies. If 
the loop gain is more than unity at some frequency at which the feedback is positive, the 
system will oscillate. Hence, in designing control systems great care has to be taken to 
avoid such situations. 
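
The effect of a time lag can be seen in a deliberately minimal toy model, sketched below in Python. It is not a model of the GMRT servo: the "plant" is just a proportional gain acting on an error computed from the output one sample earlier. Because of that one-sample lag, a disturbance that alternates in sign from sample to sample is fed back in phase, i.e. as positive feedback, and it grows whenever the loop gain exceeds unity.

```python
def simulate_delayed_feedback(loop_gain, reference=1.0, steps=60):
    """Negative feedback acting on a one-sample-old output.

    y[n] = loop_gain * (reference - y[n-1])

    Any deviation from the equilibrium value is multiplied by -loop_gain
    at every step, so it decays for |loop_gain| < 1 and grows otherwise.
    """
    y = 0.0
    history = []
    for _ in range(steps):
        y = loop_gain * (reference - y)
        history.append(y)
    return history


if __name__ == "__main__":
    stable = simulate_delayed_feedback(0.5)
    unstable = simulate_delayed_feedback(1.5)
    print("loop gain 0.5, final output:", stable[-1])
    print("loop gain 1.5, largest excursion:", max(abs(v) for v in unstable))
```

With a loop gain of 0.5 the output settles, while with 1.5 it oscillates with growing amplitude, which is the situation the compensator design has to avoid.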



G(s) = Compensator transfer function 
H(s) = Plant transfer function 


Figure 20.1: Closed loop control system. 


20.4.2 Principles of Position Control 

For controlling a heavy load, one could (as illustrated in Figure 20.2) use three nested 
feedback loops, viz. a position loop, a velocity loop and a current loop. This configuration 
allows independent tuning of the parameters of each loop without affecting the adjacent 
loop. A current amplifier is used to amplify the current for driving the motor. The position 
is sensed by a suitable transducer. The velocity of the antenna is generally sensed by 
tachometers mounted on the motor shaft. 



GB = Gear box. ENC = Encoder. PLA = Position Loop Amplifier. RLA = Rate Loop Amplifier. 


Figure 20.2: Three nested feedback loops. 

The block diagram shown above cannot be used directly in all position control 
applications. The back-lash which is inherent in any gear box introduces a non-linearity in 
the position loop. Such a system exhibits a phenomenon called “limit cycle hunting”. 
This affects the positioning accuracy of the antenna. 

20.4.3 Position Loop Amplifier 

The position loop amplifier (PLA) has two inputs viz. command input and feedback input. 
In an automatic position control system, the output of the position sensor is filtered, 
scaled and then applied to the PLA. The command signal is applied to the other input 
of the PLA. The PLA (which can be either analog or digital) subtracts its two inputs to 
generate an error signal. This error signal is then applied to the compensator. 

A compensator is designed depending on the application. For example, the GMRT 
antennas are used for tracking celestial radio sources, which move across the sky at a 
constant speed (15°/hr, the speed of the earth’s rotation). For such an application, a 
position system having a type II response is required. With a type I position compensator 
and with the use of a rate loop in the position control, the overall system response is of 
type II. 


Type of position system    Pointing Error    Tracking Error 
Type 0                     Finite            Finite 
Type I                     Zero              Finite 
Type II                    Zero              Zero 
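
The difference between the type I and type II rows can be illustrated with a small simulation. The sketch below is a simplified stand-in, not the GMRT compensator: the plant is taken to be an ideal integrator (a velocity command that is followed exactly), a purely proportional controller then gives a type I loop and a proportional-integral controller gives a type II loop. The gains and the time step are arbitrary illustrative values; the target moves at the sidereal rate of 15 arcsec per second.

```python
def tracking_error(kp, ki, rate=15.0, dt=0.01, duration=40.0):
    """Following error (arcsec) at the end of a constant-rate track.

    rate : target speed in arcsec/s (15 arcsec/s is the sidereal rate)
    kp   : proportional gain in 1/s
    ki   : integral gain in 1/s^2 (0 gives a type I loop, > 0 a type II loop)
    """
    position = 0.0
    integral = 0.0
    error = 0.0
    steps = int(round(duration / dt))
    for n in range(steps):
        target = rate * n * dt
        error = target - position
        integral += error * dt
        velocity_command = kp * error + ki * integral
        # Idealised plant: the antenna moves exactly at the commanded velocity.
        position += velocity_command * dt
    return error


if __name__ == "__main__":
    print("type I  (P only) tracking error [arcsec]:", tracking_error(2.0, 0.0))
    print("type II (PI)     tracking error [arcsec]:", tracking_error(2.0, 1.0))
```

In this toy model the proportional loop lags the target by rate/kp (7.5 arcsec for the values used), whereas the loop with the extra integrator settles to a negligible tracking error, in line with the table above.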


Parameters like the structural natural resonant frequency (Wc) and the frictional (Be) 
constants of the structure are required for the design of the position loop compensator. 
The main objective while designing the position compensator is that it should offer 
enough attenuation at the natural resonant frequency of the structure. 

The output of the PLA acts as a velocity command. If the target’s angular position is far 
removed from the current position, then the error is very large and could saturate the 
PLA. A saturated PLA output acts as a fixed velocity command to the rate loop, and 
the rate loop moves the antenna with a constant velocity towards the target position. As 
the antenna approaches the target position, the error at the output of the PLA goes on 
reducing, which commands the rate loop to reduce the speed of the antenna. When the 
antenna is at the target position the error at the output of the PLA goes to zero, which 
translates to a zero speed command to the rate loop. The sign of the error signal at the 
output of the PLA decides whether the antenna is to be moved forward or in reverse. 
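
The slewing behaviour described above can be sketched in a few lines of Python. The model is idealised: the rate loop is assumed to follow its command exactly, the saturation level is set to the azimuth slewing speed of Table 20.1 (30 deg/min, i.e. 0.5 deg/s), and the position loop gain of 1 per second is an arbitrary illustrative value.

```python
def slew_to_target(target_deg, start_deg=0.0, gain=1.0, max_rate=0.5,
                   dt=0.1, duration=200.0):
    """Simulate a saturating position loop driving an ideal rate loop.

    gain     : position loop gain in 1/s (illustrative value)
    max_rate : saturation level of the velocity command in deg/s
    Returns a list of (position_deg, rate_command_deg_per_s) samples.
    """
    position = start_deg
    trace = []
    for _ in range(int(round(duration / dt))):
        error = target_deg - position
        # A saturated PLA output is a fixed velocity command; the sign of
        # the error decides the direction of motion.
        rate_command = max(-max_rate, min(max_rate, gain * error))
        position += rate_command * dt
        trace.append((position, rate_command))
    return trace


if __name__ == "__main__":
    trace = slew_to_target(60.0)
    print("initial rate command [deg/s]:", trace[0][1])
    print("final position [deg]:", round(trace[-1][0], 6))
    print("final rate command [deg/s]:", round(trace[-1][1], 9))
```

The antenna first moves at the constant saturated rate, then slows down as the error shrinks, and the rate command falls to zero at the target.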


20.4.4 Rate Loop Amplifier 

The function of the Rate Loop Amplifier (RLA) is to control the velocity of the antenna. In 
position control applications, the rate loop improves the transient response of the position 
loop by adding a pole in the position loop. 

Figure 20.3: Rate loop amplifier. 

The output of the PLA, which acts as a velocity command, is applied to one input of the 
RLA while the tachometer signal is applied to the other input. The RLA subtracts the two 
input signals and generates an error signal, which is then applied to the compensator. For 
position control applications like the GMRT, the rate loop compensator can be of the phase 
lag type (Type 0), which avoids limit cycle hunting. The electro-mechanical time constant 
of the combined motor and load determines the bandwidth of the compensator. The 
output of the RLA acts as a command to the current loop. If the commanded speed is more 
than the actual speed, then the error at the output of the RLA becomes large, which 
commands the current loop to pass more current through the motor. 

For GMRT antennas, where a dual drive system is used, the rate loop controls the 
antenna velocity by sensing the tacho signal from both the motors. Both these tacho 
signals are averaged and then applied to RLA as feedback. A voltage corresponding to 
torque bias is added/subtracted at the output of the rate loop, to generate two current 
commands. These two current commands are applied to the two current loop amplifiers, 
for controlling currents in accordance with the rate loop. 
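
A minimal sketch of how the torque bias produces two current commands is given below. It assumes the simplest possible scheme, in which a fixed bias is added for one motor and subtracted for the other; the function name and the voltage values are illustrative, and the actual GMRT bias scheme may be more elaborate.

```python
def dual_drive_commands(rate_loop_output, torque_bias):
    """Split the rate loop output into two current commands (in volts).

    The bias is added for one motor and subtracted for the other, so the
    two motors oppose each other while their sum follows the rate loop.
    """
    command_a = rate_loop_output + torque_bias
    command_b = rate_loop_output - torque_bias
    return command_a, command_b


if __name__ == "__main__":
    # Holding position: equal and opposite commands, no net torque.
    print(dual_drive_commands(0.0, 1.0))
    # Small demand from the rate loop: one motor pushes harder, the other
    # opposes less, giving a net torque in the demanded direction.
    print(dual_drive_commands(0.3, 1.0))
```

As long as the rate loop demand stays smaller than the bias, the two motors keep pushing against each other, which is what takes up the back-lash in the gear train.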


20.4.5 Current Loop Amplifier 

The function of the Current Loop Amplifier (CLA) is to control/regulate the current of 
the motor, which results in control of the motor torque. The current of the motor is 
sensed either by a resistive shunt or with a Hall effect sensor. The control of over-current 
should be fast in order to protect the power semiconductors during starting/stopping of 
the motor or in the event of a fault. Also the steady state error of the current should be 
zero (as any error in torque affects the speed). These requirements can be met by using a 
“PI” (Proportional Integral) compensator. 





Figure 20.4: Current loop amplifier. 

The current signal is filtered, scaled and then applied to the CLA. The output of the 
RLA, which acts as a current command, is applied to the other input. The CLA subtracts 
the two input signals and generates the error signal. The error signal is applied to 
the proportional-integral (PI) compensator. In a 3-phase SCR amplifier like the one used at 
the GMRT, the motor current has a 150 Hz component along with the DC component. 
As the current is sampled and fed back to the loop amplifier, the 150 Hz component 
of the current gets injected into the loop. This is like injecting noise into the system. 
In order to avoid oscillations in the loop, the current loop compensator is designed to 
heavily attenuate the 150 Hz signal component. The filtered output of the error amplifier 
is applied to the 4-quadrant power amplifier. 


20.5 Servo Amplifiers 

Servo amplifiers are 4-quadrant, regenerative power amplifiers, supplying appropriate 
power to the motor as commanded by a control voltage. These amplifiers are capable of 
supplying energy to the load, as well as absorbing energy from the load. They are designed 
to convert the kinetic energy of the combined motor and load into electrical energy while 
the load is decelerating. 

Table 20.2: Servo amplifier specifications 

Type               3-Phase, SCR based, 4-quadrant fully regenerative. 
Control Type       Phase angle control with current loop. 
Input Volts        275 VAC L-L, 50 Hz, 3-Phase, 4-wire. 
Command Volts      +/- 10 Volt. 
Maximum Current    +/- 80 Amp. 
Protection         Over current & over speed. 

The GMRT servo amplifier is a three phase, half wave, four-quadrant, fully regenerative, 
SCR CLA for the control of permanent magnet DC brush type motors. A CLA is a 
device which keeps the current through the motor proportional to a commanded input 
signal. 


20.6 Servo Motors 

Servo motors are a special category of motors, designed for applications involving position 
control, velocity control and torque control. These motors are special in the following 
ways: 

1. Lower mechanical time constant. 

2. Lower electrical time constant. 

3. Permanent magnet of high flux density to generate the field. 

4. Fail-safe electro-mechanical brakes. 

For applications where the load is to be rapidly accelerated or decelerated frequently, 
the electrical and mechanical time constants of the motor play an important role. The 
mechanical time constants of these motors are reduced by reducing the rotor inertia. 
Hence the rotors of these motors have an elongated structure. For DC brush type motors, 
the permanent magnets are mounted on the stator, while the armature conductors are on 
the rotor. The rotating conductors make contact with the stationary electrical source via 
a brush-commutator assembly. A DC tacho is mounted on the motor shaft, for indicating 
the shaft speed in terms of a voltage. These motors also come with fail-safe 
electro-mechanical brakes. In the event of failure of the utility mains, the antennas are 
stopped by these brakes. 

20.7 Gear Reducers 

Generally the motors which are commercially available deliver low torque at high speed 
and cannot be used for driving the load directly. Gear reducers are used to increase 
the torque so as to meet the torque demand of the load. For servo applications, i.e. for 
positioning the load, the gear reducers should possess the following characteristics. 

1. Bi-directional energy flow 

2. Low back-lash 

3. Low moment of inertia 

4. High efficiency 

Table 20.3: Servo motor specifications 

Type                       DC brush type, permanent magnet field. 
Horse Power rating         6 HP. 
Rated motor voltage        150 V (DC). 
Rated motor current        80 Amp (Continuous). 
Rated motor speed          2250 rpm. 
Continuous stall torque    47 N-m. 
Peak Torque                111 N-m. 
Torque Sensitivity         0.56 N-m / Amp. 
Back E.M.F. Constant       59 V / krpm. 
Armature resistance        0.045 Ohm. 
Armature inductance        0.33 mH. 
Tacho sensitivity          17 V / krpm. 

Bi-directional energy flow means that energy can be transferred from the input to 
the output as well as from the output to the input. During deceleration, the motor is forced 
to act like a generator, converting the kinetic energy of the load into electrical energy. The 
deceleration of the load is decided by the rate of consumption of the electrical energy 
produced. Planetary gear boxes meet this requirement and are hence used at the GMRT. 
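
The trade between speed and torque made by a gear reducer can be put in numbers with the values from Tables 20.1 and 20.3. The sketch below treats the reducer as ideal (lossless unless an efficiency is supplied) and assumes that the quoted gear reduction ratio is the total reduction between the motor shaft and the antenna axis; both are assumptions made for illustration.

```python
MOTOR_RATED_SPEED_RPM = 2250.0   # Table 20.3
MOTOR_STALL_TORQUE_NM = 47.0     # Table 20.3, continuous stall torque
GEAR_RATIO = {"azimuth": 18963.0, "elevation": 25162.0}  # Table 20.1


def output_speed_deg_per_min(motor_rpm, ratio):
    """Speed of the driven axis in deg/min for a given motor speed."""
    return motor_rpm / ratio * 360.0


def output_torque_nm(motor_torque_nm, ratio, efficiency=1.0):
    """Torque at the driven axis for one motor; efficiency=1.0 is lossless."""
    return motor_torque_nm * ratio * efficiency


if __name__ == "__main__":
    for axis, ratio in GEAR_RATIO.items():
        speed = output_speed_deg_per_min(MOTOR_RATED_SPEED_RPM, ratio)
        torque = output_torque_nm(MOTOR_STALL_TORQUE_NM, ratio)
        print(f"{axis}: {speed:.1f} deg/min at rated motor speed, "
              f"{torque / 1e3:.0f} kN-m per motor at stall torque")
```

Under these assumptions the axis speeds at rated motor speed come out somewhat above the slewing speeds of Table 20.1, while each motor's 47 N-m is multiplied to several hundred kN-m at the axis.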


20.8 Position Sensors 

Optical position sensors are the sensors of choice for highly accurate positioning of 
antennas. There are two broad styles of encoder, viz. incremental and absolute. An 
incremental encoder is made of a glass disc and a light interrupter. Transparent and 
opaque markings are put on the outer periphery of the glass disc. Light emitted from a 
lamp or LED is interrupted by the glass disc and received by a photo diode. As the disc 
rotates, the light falling on the photo detector is interrupted by the opaque markings, 
leading to pulses in the photodetector. These pulses are counted to determine the change 
in position. The disc also has an index marker, which is used to provide a reference. 
Though incremental encoders are simple in construction and provide a cheap solution for 
position sensing, they suffer from one drawback. On failure of the power to the encoder or 
the electronic circuit, the electronic counter loses its count value, and hence all information 
as to the current position. Hence, upon the resumption of power to the antenna, one 
would need to move the antenna until the index marker pulse is received, a procedure 
called “homing”. For large antennas like those at the GMRT, this is unacceptable and 
hence absolute encoders have to be used. 

In an absolute encoder, a pattern corresponding to a Gray code is printed on the glass 
disc. The glass disc moves between a light emitter and a set of light detectors. The 
number of light detectors is in proportion to the number of bits of the encoder. This 
enables the encoder to generate a binary word corresponding to the angular position of 
its shaft. The electronics housed inside the encoder convert the Gray code to natural 
binary. The parallel code is also converted into a serial format for transmission over a 
long distance cable. An encoder is directly mounted on each axis of an antenna. 

Table 20.4: Encoder specifications. 

Type                     Optical, absolute shaft encoder. 
Resolution               17 bit (10 arcsec). 
Max. Shaft speed         600 rpm. 
Max. Data rate update    100 kHz. 
Illumination             Light emitting diode. 
Input Power              +5 V DC at 300 mA. 
Output Code              Natural binary. 
Output data format       Serial. 
Data transmission        RS-422 differential line driver. 
Serial output            MSB first, LSB last & then parity bit. 
Count Direction          CW increasing. 
Operating temp.          0° C to +70° C. 
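
The Gray code to natural binary conversion, and the angular resolution of a 17 bit encoder, can be sketched as follows. This uses the standard reflected binary Gray code; whether the GMRT encoders use exactly this variant is an assumption, and the function names are illustrative.

```python
ENCODER_BITS = 17  # resolution quoted in Table 20.4


def binary_to_gray(value):
    """Reflected binary Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Inverse of binary_to_gray: XOR together all right shifts of the code."""
    binary = gray
    shift = gray >> 1
    while shift:
        binary ^= shift
        shift >>= 1
    return binary


def counts_to_degrees(count, bits=ENCODER_BITS):
    """Shaft angle in degrees for a natural binary encoder reading."""
    return count * 360.0 / (1 << bits)


RESOLUTION_ARCSEC = 360.0 * 3600.0 / (1 << ENCODER_BITS)

if __name__ == "__main__":
    print("resolution [arcsec]:", round(RESOLUTION_ARCSEC, 2))
    for n in (5, 6, 7, 8):
        print(n, format(binary_to_gray(n), "04b"), gray_to_binary(binary_to_gray(n)))
    print("half a turn [deg]:", counts_to_degrees(1 << (ENCODER_BITS - 1)))
```

Adjacent positions differ in only one bit of the Gray code, so a reading taken while the disc is between two positions can be wrong by at most one count; the 17 bit word corresponds to about 9.9 arcsec per count, consistent with the 10 arcsec quoted in Table 20.4.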


20.9 Dual Drive 

For a large antenna, the torque required to move the antenna is high, hence gear reducers 
with a large ratio are used to meet the required torque demand. It is almost impossible 
in practice to manufacture a gear box which can deliver a large power with no back-lash. 
Any effort to reduce back-lash by tight coupling of pinions increases the friction of 
the gear box, which reduces its efficiency. With the use of large gear ratios the back-lash 
and hysteresis between the motor shaft and the load shaft increase. With the increase in 
these parameters the nonlinearity in the position loop increases, which leads to position 
loop instability. There are various ways to reduce the back-lash mechanically but they 
are inefficient and are unsuitable for giant antennas like those at the GMRT. Instead 
one uses a dual drive. Here a pair of motors, each with its own gearbox and pinion, is used 
to drive the common load. 

Two amplifiers individually drive the motors. When the load is to be held at some 
position, the torques produced by the two motors are equal and opposite, thereby 
eliminating the back-lash. The net torque on the load is zero, hence it does not move. For a 
slight movement of the load in a given direction, one motor increases its torque in that 
direction while the other reduces its torque. The load is then subjected to a net torque, 
which causes a small movement of the load. 


20.10 Digital Controller 

The digital controller for the GMRT antennas is built around Intel’s 8086 processor running 
at 8 MHz and is called the “Station Servo Computer”. The 8086 is a bus master, controlling 
two 8031 slave processors for the analog and encoder interfaces. The position loop of 
both the axes of the GMRT servo system is implemented digitally in this servo computer. 
The elevation and azimuth axes angles, along with the time, are fed to the servo computer 
by the antenna base computer (ABC, see Section 24.2.4). The servo computer computes the 
error of both the axes and performs the necessary filtering (compensation). The compensator 
output is converted into an analog signal using a 16 bit DAC and then applied to the rate 
loop. 



Figure 20.5: Dual drive position control system. 


For the digital implementation of a position loop, the sampling rate must be large 
enough. The “S” domain transfer function of the compensator is converted into a “Z” 
domain transfer function by using “Tustin’s approximation”. The Z-domain transfer 
function is further converted into a difference equation, to be solved recursively at a 
regular interval. Tustin proposes that the sampling frequency must be greater than 10 times 
the compensator bandwidth. With 1.5 Hz as the structural resonant frequency of the GMRT 
antennas, the position loop bandwidth can be around 0.4 Hz to 0.5 Hz. For a 0.5 Hz 
loop bandwidth the sampling rate should be more than 5 Hz. This sets the lower limit of 
the sampling rate. The upper limit of the sampling rate is determined by the processor 
speed, other tasks of the processor, the transport lag etc. We have chosen 10 Hz as the 
sampling rate. The processor is interrupted at a regular interval of 100 ms to run the real 
time programme. 
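
As an illustration of the procedure, the sketch below applies Tustin's substitution s -> (2/T)(1 - z^-1)/(1 + z^-1) to a simple proportional-integral compensator C(s) = kp + ki/s and evaluates the resulting difference equation once per 100 ms sample. The PI form and the gains are hypothetical choices made for the example; the actual GMRT compensator is not specified here.

```python
class TustinPI:
    """PI compensator C(s) = kp + ki/s discretised with Tustin's rule.

    Substituting s = (2/T) * (1 - z^-1) / (1 + z^-1) gives the recursion
        u[n] = u[n-1] + kp * (e[n] - e[n-1]) + (ki * T / 2) * (e[n] + e[n-1])
    """

    def __init__(self, kp, ki, sample_period):
        self.kp = kp
        self.ki = ki
        self.sample_period = sample_period
        self.prev_error = 0.0
        self.prev_output = 0.0

    def update(self, error):
        """Advance the difference equation by one sample and return u[n]."""
        output = (self.prev_output
                  + self.kp * (error - self.prev_error)
                  + 0.5 * self.ki * self.sample_period * (error + self.prev_error))
        self.prev_error = error
        self.prev_output = output
        return output


if __name__ == "__main__":
    # 10 Hz sampling as in the text; the gains are illustrative only.
    compensator = TustinPI(kp=2.0, ki=1.0, sample_period=0.1)
    outputs = [compensator.update(1.0) for _ in range(11)]
    print([round(u, 3) for u in outputs])
```

For a constant unit error the output starts at kp plus half an integration step and then ramps by ki*T per sample, as expected of a PI term. In a real-time implementation the `update` call would sit in the 100 ms interrupt routine.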

20.11 Servo Operational Commands 

The central control station sends commands to a group of antennas via an optical fiber 
link (see Chapter 24). Some of the operational commands related to the servo are described 
next. 

1. COLDSTART: On receiving this command, the servo system removes the stow-lock 
pins, releases the motor brakes, enables the servo amplifiers, holds both the axes at 
the current angle and waits for next command. 

2. MV arg1,arg2: Move along the azimuth and elevation axes to the angles arg1 and 
arg2 respectively. The servo system releases the motor brakes and moves the antenna. 

3. TRACK arg1,arg2,arg3: Track in the azimuth and elevation axes with the destination 
angles as arg1 and arg2 and the time parameter as arg3. 

4. HOLD: Holds both the axes. On receiving this command, the servo system releases the 
brakes of both axes motors and holds the antenna in position. 

5. STOP: Stops both the axes. On receiving this command, the servo system disables the 
amplifiers & applies the brakes to both axes motors. 

6. CLOSE: Close the observations. On receiving this command, the servo system positions 
the elevation axis to 90:00:00 deg., disables all amplifiers, applies the brakes to all 
motors & inserts the stow-lock pin. 

7. STOW: Inserts the stow-in pin in the elevation axis and locks the axis. 

8. SWRELE: Releases stow-in pin from the elevation axis and frees the axis. 

9. RSTSERVO: Resets the station servo computer. 



